In this script, we will be modelling the causal relation between effort and correction to confirm/reject our hypothesis.
These are:
H1: In corrections, people enhance some effort-related kinematic and/or acoustic features of their behaviour relative to the baseline.
H2: The enhancement depends on similarity of the guesser’s answer and the original meaning. More similar answer will require/result in smaller enhancement (but still enhancement) than less similar answer.
To assess the design and performance of the model, we will use synthetic data that we create to have certain interdependencies and where the effects have pre-defined values. We use this approach instead of using our dyad 0 data, because the pilot data do not include enough data to have a sensible testing sample.
Since we cannot predict the outcomes of the XGBoost modelling (in the previous script) and therefore do not know the variables we will model after preregistration, our model only assumes a continuous variable but it is otherwise free of any other assumptions.
library(here)
library(dplyr) # for data-wrangling
library(lme4) # for linear mixed-effects models
library(tidyr) # for reshaping data (if needed)
library(ggplot2)
library(tibble)
library(rcartocolor)
library(ggdag) # for dag
library(dagitty)
library(ggplot2) # bayesian stuff
library(patchwork)
library(bayesplot)
library(brms)
library(beepr)
library(bayestestR)
library(tidyverse)
# current folder (first go to session -> set working directory -> to source file location)
parentfolder <- dirname(getwd())
datasets <- paste0(parentfolder, '/09_Analysis_Modeling/datasets/')
models <- paste0(parentfolder, '/09_Analysis_Modeling/models/')
plots <- paste0(parentfolder, '/09_Analysis_Modeling/plots/')
#TODO
Because current pilot data are quite likely unsifficient in terms of power, and would hinder testing the design of the statistical models, we will create synthetic data with assumed coefficients for all predictors we plan to add to the model.
# Set seed for reproducibility
set.seed(0209)
# Set coefficients
b_exp_vocal <- 0.6 # Vocal has lower expressibility
b_exp_multi <- 1.5 # Multimodal has higher expressibility
b_bif <- 1.15 # More extroverted → more effort
b_fam <- 1.10 # More familiarity → more effort
b_exp <- 1.20 # More expressibility → more effort
b_multi <- 0.70 # Multimodal → slightly reduced effort
b_comatt2 <- 1.50 # Effort increases for second attempt
b_comatt3 <- 0.50 # Effort decreases for third attempt
b_prevan <- 0.50 # Higher previous answer similarity → less effort
# Define participants and concepts
n_participants <- 120
n_total_concepts <- 21 # Each participant works with all 21 concepts
n_modalities <- 3 # Gesture, vocal, combined
# Generate participant-level data
participants <- 1:n_participants
Big5 <- runif(n_participants, min = 0, max = 2)
Familiarity <- runif(n_participants, min = 0, max = 2)
# Generate expressibility scores for each concept across modalities
expressibility_matrix <- matrix(runif(n_total_concepts * n_modalities, min = 0, max = 1),
nrow = n_total_concepts, ncol = n_modalities)
# Initialize data storage
final_data_synt <- data.frame()
# Define function to simulate participant data
simulate_participant <- function(participant_id) {
participant_data <- data.frame()
trial_number <- 1
for (concept_id in 1:n_total_concepts) {
# Assign a random modality
modality <- sample(c("gesture", "vocal", "combined"), 1)
# Get expressibility score based on modality
expressibility_score <- ifelse(modality == "vocal", expressibility_matrix[concept_id, 1] * b_exp_vocal,
ifelse(modality == "gesture", expressibility_matrix[concept_id, 2],
expressibility_matrix[concept_id, 3] * b_exp_multi))
# Base effort before adjustments
base_effort <- b_bif * Big5[participant_id] +
b_fam * Familiarity[participant_id] +
b_exp * expressibility_score +
rnorm(1, mean = 1, sd = 0.5)
# Adjust effort for multimodal condition
if (modality == "combined") {
base_effort <- base_effort * b_multi
}
# Sample the number of communicative attempts (CommAtt)
adjusted_prob <- c(1 - Familiarity[participant_id],
1 - Familiarity[participant_id],
1 - Familiarity[participant_id]) *
c(1 - Big5[participant_id],
1 - Big5[participant_id],
1 - Big5[participant_id]) *
c(1 - expressibility_score,
1 - expressibility_score,
1 - expressibility_score)
adjusted_prob <- adjusted_prob / sum(adjusted_prob)
n_attempts <- sample(1:3, 1, prob = adjusted_prob)
prev_answer_similarity <- NA # First attempt has no previous similarity
# Generate rows for each communicative attempt
for (attempt in 1:n_attempts) {
Eff <- base_effort # Start with base effort
# Modify effort for second and third attempts
if (attempt == 2) {
Eff <- Eff * b_comatt2
} else if (attempt == 3) {
Eff <- Eff * b_comatt3
}
# Adjust effort based on previous answer similarity
if (!is.na(prev_answer_similarity)) {
Eff <- Eff * (1 + (1 - prev_answer_similarity) * b_prevan)
}
# Store row
participant_data <- rbind(participant_data, data.frame(
Participant = participant_id,
Concept = concept_id,
Modality = modality,
Big5 = Big5[participant_id],
Familiarity = Familiarity[participant_id],
Expressibility = expressibility_score,
CommAtt = attempt,
Eff = Eff,
TrialNumber = trial_number,
PrevAn = prev_answer_similarity
))
# Update for next attempt
trial_number <- trial_number + 1
prev_answer_similarity <- runif(1, min = 0, max = 1) # Simulate next similarity
}
}
return(participant_data)
}
# Simulate data for all participants
final_data_synt <- do.call(rbind, lapply(participants, simulate_participant))
# Preview results
head(final_data_synt)
## Participant Concept Modality Big5 Familiarity Expressibility CommAtt
## 1 1 1 combined 1.919899 1.843906 0.5761723 1
## 2 1 1 combined 1.919899 1.843906 0.5761723 2
## 3 1 1 combined 1.919899 1.843906 0.5761723 3
## 4 1 2 gesture 1.919899 1.843906 0.3493787 1
## 5 1 2 gesture 1.919899 1.843906 0.3493787 2
## 6 1 2 gesture 1.919899 1.843906 0.3493787 3
## Eff TrialNumber PrevAn
## 1 4.343849 1 NA
## 2 6.566250 2 0.98450637
## 3 2.367010 3 0.82035722
## 4 5.497853 4 NA
## 5 10.367605 5 0.48565959
## 6 3.998448 6 0.09090184
So now we have synthetic data where we exactly defined what is the relationship between certain variables
The effort for second attempt is multiplied by 1.50 (Beta = 1.50) The effort for third attempts by 0.7 (ie decreases, Beta = 0.5)
Beta = 1.10 for Eff Negative effect on CommAtt
Beta = 1.15 for Eff Negative effect on CommAtt
Beta = 1.20 for Eff Negative effect on CommAtt
(For simplicity reasons, we do not create an affect of trial number, as it does not represent a crucial variable)
Note that the synthetic data do not copy the structure of the data and the experimental design perfectly, but it does provide a reliable ground to build the models and test their performance.
nrow(final_data_synt) # this is the number of datapoints
## [1] 5076
hist(final_data_synt$Eff)
final_data_synt |>
janitor::tabyl(Participant, Concept) # the number of repetition per concept by participant
## Participant 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
## 1 3 3 3 2 1 3 1 3 3 1 1 1 1 2 3 1 3 2 2 2 3
## 2 3 2 1 2 2 3 3 1 3 2 3 1 1 3 3 2 2 1 1 2 1
## 3 1 2 2 3 2 3 2 3 2 1 3 2 3 1 1 2 1 3 3 2 1
## 4 3 3 1 3 3 3 1 1 1 1 1 3 2 2 2 1 3 1 1 1 3
## 5 2 2 1 3 3 2 3 3 2 1 1 1 2 1 2 2 3 1 3 1 3
## 6 3 2 2 3 1 1 3 1 1 2 1 2 1 3 3 3 2 3 1 1 1
## 7 1 1 1 3 2 2 3 1 3 3 2 1 3 3 3 3 2 1 3 2 1
## 8 2 2 2 2 1 3 1 2 2 1 1 3 1 3 3 2 1 3 2 1 3
## 9 2 1 3 2 2 3 3 1 1 1 2 3 3 3 2 3 1 3 1 2 2
## 10 2 3 2 1 3 3 1 1 1 3 1 1 3 1 3 3 3 1 1 3 1
## 11 3 3 3 1 2 1 3 1 1 2 1 3 2 2 2 1 3 3 3 3 1
## 12 2 2 3 2 3 3 2 1 2 3 1 2 2 3 2 2 2 2 1 3 1
## 13 2 2 3 3 2 3 3 3 2 3 3 1 3 2 3 1 2 1 1 2 3
## 14 2 1 3 2 2 3 2 2 3 3 2 3 1 2 2 3 3 3 3 3 1
## 15 2 3 1 2 1 3 1 1 3 1 3 3 1 2 1 1 3 2 1 2 3
## 16 1 3 2 2 1 3 1 3 3 1 2 2 1 1 2 2 2 1 1 3 3
## 17 1 1 2 1 3 3 3 1 3 3 3 3 3 2 3 3 1 1 3 3 1
## 18 1 1 1 1 2 1 1 1 3 1 2 1 2 3 2 2 2 2 1 1 2
## 19 2 1 2 3 2 1 1 1 1 3 1 1 3 3 3 3 3 3 1 3 2
## 20 2 2 1 3 1 1 3 3 2 1 3 3 2 2 3 2 1 3 2 2 2
## 21 2 2 3 1 1 1 1 3 2 1 2 3 2 3 1 3 3 2 3 1 3
## 22 3 1 1 1 3 3 2 1 2 3 1 2 1 2 1 1 2 3 3 1 1
## 23 2 2 1 1 1 2 2 1 3 2 2 2 3 3 1 3 2 3 2 1 2
## 24 3 2 1 3 3 1 3 1 2 3 3 2 3 2 1 2 1 2 3 3 1
## 25 2 2 1 1 3 1 3 2 1 3 2 3 3 2 3 1 3 1 1 3 3
## 26 2 2 2 2 1 1 2 1 2 2 1 1 2 3 3 3 1 2 1 2 3
## 27 3 2 3 1 3 1 2 1 1 3 2 3 1 3 2 1 1 2 1 3 2
## 28 2 2 1 3 2 1 1 1 3 2 1 2 3 3 3 1 3 1 2 1 1
## 29 2 3 2 2 2 3 3 2 1 2 1 2 3 2 2 3 3 2 3 2 2
## 30 1 3 3 1 2 3 3 2 1 2 3 1 3 1 1 1 2 2 3 3 3
## 31 1 3 3 3 3 3 2 3 1 3 1 2 2 2 1 2 3 3 2 2 3
## 32 3 2 2 1 3 2 2 1 3 1 1 2 2 3 2 3 1 1 3 2 2
## 33 2 1 2 3 1 1 1 1 2 2 3 2 2 1 1 3 1 3 2 2 1
## 34 1 1 3 1 1 2 2 2 1 3 3 1 1 3 2 3 2 2 3 1 3
## 35 1 3 3 3 1 1 2 3 1 2 1 2 3 2 1 1 3 1 1 1 3
## 36 2 2 3 2 3 3 3 1 1 1 3 1 3 1 3 3 3 3 1 3 1
## 37 1 1 1 3 1 2 2 1 3 3 3 2 3 1 2 2 3 3 2 1 1
## 38 3 1 3 3 1 3 3 1 3 1 2 1 3 1 3 1 3 3 3 2 1
## 39 3 1 1 3 2 1 2 2 1 2 1 2 1 1 2 3 1 1 1 2 1
## 40 1 3 1 1 3 2 1 2 1 2 2 3 3 3 3 2 1 1 2 2 1
## 41 3 1 3 1 1 2 2 1 3 2 2 1 3 2 1 3 3 3 2 1 3
## 42 1 1 2 3 2 3 3 2 3 2 3 3 2 2 1 3 2 2 1 2 1
## 43 1 1 2 3 1 2 1 1 3 2 2 3 1 1 2 3 1 1 2 1 2
## 44 2 1 1 1 2 3 1 1 1 3 3 2 3 2 3 3 3 3 3 2 2
## 45 2 2 3 3 3 1 3 2 3 3 2 1 3 1 1 3 1 1 2 1 2
## 46 1 1 3 1 1 2 1 2 1 1 1 1 1 1 1 1 3 3 2 3 2
## 47 2 1 3 2 1 2 1 1 1 3 1 3 2 1 1 3 3 2 1 2 3
## 48 2 3 2 3 1 2 3 3 1 1 1 1 2 2 3 2 2 2 3 3 3
## 49 2 1 1 3 3 3 2 2 2 3 1 1 2 3 1 1 2 1 3 2 3
## 50 2 1 2 2 1 3 3 3 3 3 2 3 2 3 2 2 1 3 3 2 3
## 51 1 2 2 3 2 2 2 3 3 1 3 3 2 3 3 1 1 3 3 3 3
## 52 3 1 2 2 1 3 1 1 1 2 2 3 3 3 1 3 2 3 2 1 2
## 53 1 3 2 3 3 1 2 2 2 2 1 3 2 1 1 1 1 3 2 1 3
## 54 3 3 3 1 3 3 1 1 3 1 2 2 1 3 3 1 2 3 3 3 1
## 55 1 1 2 2 1 3 2 1 1 1 3 2 2 3 2 2 1 1 2 2 3
## 56 3 2 2 3 2 3 3 1 2 3 2 2 2 3 3 1 2 1 1 1 1
## 57 3 3 3 1 3 2 2 1 2 2 2 2 3 3 3 3 3 3 3 1 1
## 58 3 2 1 3 2 2 3 1 3 1 2 2 3 3 1 3 1 3 2 3 1
## 59 3 1 2 3 2 1 3 1 3 2 2 2 2 3 1 1 1 1 1 1 2
## 60 2 1 1 2 2 3 2 2 3 1 3 1 2 1 3 1 3 3 3 3 2
## 61 2 3 3 3 2 2 2 1 3 2 1 3 3 3 3 1 3 2 2 2 1
## 62 3 1 2 3 1 3 2 3 1 3 2 1 3 1 1 1 1 2 3 1 1
## 63 2 2 1 3 3 1 3 2 1 2 2 3 2 1 1 1 3 2 3 2 2
## 64 2 2 3 2 3 3 2 1 1 3 1 1 3 3 2 3 3 2 2 1 3
## 65 1 1 1 2 2 1 1 2 2 1 3 3 1 2 2 3 2 2 2 3 3
## 66 1 3 1 1 3 2 1 2 2 3 1 3 1 3 1 3 3 3 2 2 2
## 67 3 2 1 1 2 3 1 2 1 2 2 3 1 3 3 3 2 1 3 3 2
## 68 1 1 3 2 3 2 2 3 2 2 2 2 3 2 1 2 1 3 3 1 2
## 69 2 2 1 2 2 1 3 1 2 3 1 3 2 3 3 2 3 2 2 1 2
## 70 2 1 3 1 1 1 1 2 3 2 3 3 3 2 2 1 2 3 2 3 2
## 71 2 2 2 3 1 1 2 3 3 1 1 2 3 1 3 3 3 2 1 2 1
## 72 3 1 3 2 1 1 2 1 3 3 1 2 1 3 1 2 2 2 1 3 1
## 73 2 3 1 1 1 3 2 2 3 2 3 3 3 2 1 2 1 2 3 3 1
## 74 3 3 1 2 2 3 2 1 1 1 1 2 1 2 1 2 3 3 2 3 3
## 75 1 1 2 3 3 3 3 2 1 2 3 3 3 3 3 1 3 3 1 1 1
## 76 2 1 3 3 3 3 3 1 3 2 1 3 2 3 2 2 3 2 1 2 2
## 77 1 3 2 1 2 2 1 3 3 1 2 2 2 2 2 2 2 1 2 1 1
## 78 3 1 3 3 2 3 1 2 3 2 2 2 2 3 2 1 1 1 2 2 2
## 79 3 2 1 2 1 3 1 2 3 3 2 1 2 2 1 3 3 3 2 2 2
## 80 2 2 1 1 1 1 2 3 2 3 3 3 2 1 1 1 2 3 1 3 2
## 81 2 2 2 1 3 2 3 1 1 3 2 2 2 3 2 3 3 1 1 3 3
## 82 1 2 2 2 3 3 3 3 1 3 1 2 3 2 2 3 3 1 2 3 3
## 83 3 3 2 2 2 2 1 1 3 2 3 1 1 1 3 1 2 1 1 3 1
## 84 3 1 1 3 1 1 1 1 3 1 1 2 3 2 1 1 1 3 3 1 3
## 85 2 3 1 3 1 1 1 1 3 3 3 3 1 3 3 2 2 2 2 1 2
## 86 3 2 3 2 1 3 3 2 3 3 3 3 2 1 1 3 1 1 1 1 1
## 87 2 2 3 2 1 2 2 1 3 3 3 3 1 1 1 3 1 3 3 3 2
## 88 2 1 1 2 1 1 1 2 2 3 2 1 1 2 1 1 3 1 3 3 1
## 89 1 2 3 3 3 2 3 2 2 3 1 1 3 3 3 2 2 3 1 1 3
## 90 1 3 2 2 3 3 1 1 3 2 2 3 2 2 2 2 1 1 1 1 3
## 91 2 3 1 3 1 2 2 1 2 2 2 3 1 3 1 3 1 3 2 1 1
## 92 2 2 2 3 2 3 2 2 2 1 2 1 2 1 2 2 2 1 1 1 3
## 93 2 1 1 2 2 3 1 3 1 1 3 3 1 3 1 2 2 1 3 3 3
## 94 2 2 1 2 3 2 3 3 1 2 1 3 2 2 1 2 1 3 1 3 3
## 95 2 1 1 1 3 2 3 3 3 3 1 1 2 3 3 3 3 2 3 3 3
## 96 3 2 1 1 2 1 1 2 3 1 1 2 1 1 2 2 3 2 2 2 2
## 97 1 2 1 2 3 1 1 1 1 2 3 3 3 3 2 3 2 3 1 2 2
## 98 3 1 1 1 2 2 2 1 1 2 2 2 1 1 1 1 2 1 2 1 2
## 99 3 1 3 1 3 1 3 3 3 3 2 3 2 3 2 3 3 2 2 2 2
## 100 3 3 3 1 3 3 3 3 2 2 2 2 3 2 2 2 2 1 2 3 1
## 101 1 1 2 2 1 2 3 1 1 2 1 3 2 1 1 3 2 2 3 1 2
## 102 3 3 3 1 2 2 2 3 1 2 3 3 2 3 3 2 2 1 1 2 3
## 103 2 2 1 2 1 2 1 3 2 2 3 2 3 1 1 1 1 1 2 3 1
## 104 1 1 1 3 2 2 2 2 2 3 3 2 3 1 2 2 1 2 3 3 3
## 105 3 1 2 2 1 2 3 2 2 2 2 2 2 2 1 1 2 2 1 1 1
## 106 3 1 2 1 1 2 1 3 1 3 2 1 1 3 1 3 2 1 1 1 2
## 107 3 3 3 2 2 2 3 1 2 1 3 1 1 3 2 2 1 2 1 3 3
## 108 3 1 3 3 2 1 2 3 1 2 2 3 1 3 1 3 3 2 3 2 2
## 109 2 3 3 1 2 3 3 2 3 3 1 2 2 2 2 1 2 2 1 3 2
## 110 2 2 2 2 1 3 1 3 2 3 3 2 3 3 1 1 1 2 2 3 3
## 111 1 1 2 1 3 2 1 1 2 1 2 1 1 2 2 2 1 2 3 1 3
## 112 3 2 3 3 2 1 3 1 1 1 3 2 1 3 2 1 3 2 1 3 3
## 113 3 3 1 2 3 3 1 2 1 1 3 3 2 2 3 3 3 2 3 1 3
## 114 2 1 3 2 1 1 2 1 1 3 3 1 1 1 3 1 1 3 3 3 2
## 115 3 1 3 2 3 1 3 3 3 3 3 1 1 2 3 3 2 1 3 2 2
## 116 1 1 1 1 2 2 3 2 3 1 1 3 2 2 3 2 2 3 2 1 1
## 117 2 3 2 3 2 1 1 3 1 1 2 2 3 2 3 3 3 1 2 1 1
## 118 3 2 1 2 2 3 1 1 1 3 2 2 2 1 3 1 3 3 1 2 3
## 119 1 3 1 2 2 2 1 2 1 2 2 3 3 1 2 1 3 3 2 3 1
## 120 1 1 3 1 1 3 2 1 3 3 2 2 1 3 3 2 1 2 3 1 1
# Create a boxplot comparing Effort across different Communicative Attempts
ggplot(final_data_synt, aes(x = as.factor(CommAtt), y = Eff)) +
geom_boxplot(aes(fill = as.factor(CommAtt))) +
labs(title = "Comparison of Effort Across Communicative Attempts",
x = "Communicative Attempts",
y = "Effort",
fill = "CommAtt") +
theme_minimal() +
theme(legend.position = "none")
The visual already hints on the effects we expect to exist in the
data.
# Filter out CommAtt == 1
filtered_data <- final_data_synt[final_data_synt$CommAtt != 1, ]
# Scatter plot with regression line
library(ggplot2)
ggplot(filtered_data, aes(x = PrevAn, y = Eff)) +
geom_point(alpha = 0.6, color = "blue") + # Scatter points
geom_smooth(method = "lm", color = "red", se = FALSE) + # Regression line
labs(x = "Previous Answer Similarity (PrevAn)",
y = "Effort (Eff)",
title = "Relationship between Effort and Previous Answer Similarity") +
theme_minimal()
Before doing that, we create a causal diagram, so-called directed acyclic graphs (DAG). This will help us clarify the causal model and introduce the predictors we aim to model.
Our two hypothesis go as follows:
H1: Correction recruits more physical effort than the baseline performance.
H2: A higher degree of misunderstanding will require a performer to engage in more effortful correction.
The relationship of interest is the causal effect of communicative attempt on effort. Our assumptions include:
Personality traits (measured with Big5) will influence effort (e.g., people are more willing to put more effort if they are open-minded) and also communicative attempt (e.g., more extroverted people are better in this game, therefore they need less attempts)
Familiarity with guessing partner influences effort (ditto) as well as communicative attempt (e.g., friends are better in this game than strangers)
Similarly, participant will also directly influence effort and commAtt, because personality traits might not be capturing all variation
Expressibility of concepts is going to influence effort (e.g., more expressible concepts allow more enhancement - but could be also other direction) as well as CommAtt (e.g., more expressible concepts are more readily guessed and don’t need more attempts)
Similarly, concept will also directly influence effort and commAtt, because expressibility might not be capturing all variation
Modality (uni vs. multi) will influence the value of effort. We assume that in unimodal condition, the feature does not need to account for synergic relations with the other modality, and may carry the whole signal. In multimodal condition, however, there may be synergic relations that moderate the full expressiveness of this feature
Trial number (characterising how far one is in the experiment) could be hinting on learning processes through out the experiment, or - the other direction - on increasing fatigue
Previous answer (PrevAn) will affect the effort (more similar answer will require less effortful correction) as well as communicative attempt (correct answers do not require further corrections)
Expressibility of concepts will influence the answer (more expressible are easier to guess)
Trial number will also affect previous answer if there is learning involved, or conversely, increasing fatigue
These are the implied conditional independencies
## Big5 _||_ Conc
## Big5 _||_ Expr
## Big5 _||_ Fam | Pcn
## Big5 _||_ Mod
## Big5 _||_ TrNm
## Conc _||_ Fam
## Conc _||_ Mod
## Conc _||_ Pcn
## Conc _||_ TrNm
## Expr _||_ Fam
## Expr _||_ Mod
## Expr _||_ Pcn
## Expr _||_ TrNm
## Fam _||_ Mod
## Fam _||_ TrNm
## Mod _||_ Pcn
## Mod _||_ TrNm
## Pcn _||_ TrNm
This is the adjustment set that needs to be included in the model to make sure we are not confounding
## { Big5, Conc, Expr, Fam, Mod, Pcn, TrNum }
In this part, we are going to test the following hypothesis:
H1: Correction recruits more physical effort than the baseline performance.
To be able to interpret the data, we will first need to do some contrast coding
final_data$CommAtt <- as.factor(final_data$CommAtt)
final_data$Modality <- as.factor(final_data$Modality)
final_data$Participant <- as.factor(final_data$Participant)
final_data$Concept <- as.factor(final_data$Concept)
final_data$TrialNumber <- as.numeric(final_data$TrialNumber) # Ensure TrialNumber is numeric
contrasts(final_data$CommAtt) <- MASS::contr.sdif(3)
contrasts(final_data$Modality) <- contr.sum(3)/2
This is how CommAtt is contrast-coded now
2-1 3-2
1 -0.6666667 -0.3333333 2 0.3333333 -0.3333333 3 0.3333333 0.6666667
This is how modality is cc-ed
[,1] [,2]
combined 0.5 0.0 gesture 0.0 0.5 vocal -0.5 -0.5
final_data$TrialNumber_c <- final_data$TrialNumber - median(range(final_data$TrialNumber))
final_data$Familiarity <- final_data$Familiarity - median(range(final_data$Familiarity))
final_data$Big5 <- final_data$Big5 - median(range(final_data$Big5))
# For now, we will just center Familiarity and Big5 because we created them synthetically. But the real data have these two variables already z-scored
final_data <-
final_data |>
group_by(Modality) |>
mutate(Expressibility_z = (Expressibility - mean(Expressibility))/ sd(final_data$Expressibility, na.rm = T)) |>
ungroup()
First we want to do a simple model that, however, reproduce our DAG.
Our main predictor is communicative attempt (CommAtt). To account for confounders - variables affecting both the predictor and the dependent variable - we need to adjust for them in the model by including them as covariates to isolate the effect of the predictor. Based on our DAG, confounders include:
We include 1.-5. as fixed factors. For 6.-7., we include varying intercepts as we expect that each participant and concept may have they own baseline level of effort and thus allow for individual variation. Partial pooling is also beneficial in that extreme values (or categories will fewer datapoints) will be shrunk toward the overal average.
We will now not include PrevAn (previous answer) variable because we will need to do some further data-wrangling when building model for H2. That is mainly because PrevAn has some NA values, concretely for first attempts. Models would automatically exclude NA data, and we would therefore loose all effort data for attempt 1. For H1, however, we want to keep it there.
This is OUR FIRST model without setting any priors, leaving them to default values
h1.m1 <- brm(Eff ~ 1 + CommAtt + Familiarity + Big5 + Expressibility_z + TrialNumber_c + Modality + (1 | Participant) + (1 | Concept),
data = final_data,
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h1.m1 <- add_criterion(h1.m1, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h1.m1_R2 <- bayes_R2(h1.m1)
# Save both as objects
saveRDS(h1.m1, here("09_Analysis_Modeling", "models", "h1.m1.rds"))
saveRDS(h1.m1_R2, here("09_Analysis_Modeling", "models", "h1.m1_R2.rds"))
beep(5)
h1.m1 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m1.rds"))
h1.m1_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m1_R2.rds"))
# Summary
summary(h1.m1)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: Eff ~ 1 + CommAtt + Familiarity + Big5 + Expressibility_z + TrialNumber_c + Modality + (1 | Participant) + (1 | Concept)
## Data: final_data (Number of observations: 5076)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.04 0.02 0.00 0.09 1.00 2122 1863
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.03 0.02 0.00 0.07 1.00 2320 3475
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 4.02 0.02 3.98 4.06 1.00 6694 5622
## CommAtt2M1 3.08 0.03 3.03 3.14 1.00 8963 6147
## CommAtt3M2 -4.39 0.04 -4.46 -4.32 1.00 8289 5880
## Familiarity 1.26 0.02 1.21 1.30 1.00 8380 5226
## Big5 1.29 0.02 1.25 1.34 1.00 9979 6018
## Expressibility_z 0.46 0.02 0.43 0.49 1.00 9888 5989
## TrialNumber_c -0.00 0.00 -0.00 0.00 1.00 7595 5984
## Modality1 -1.30 0.04 -1.38 -1.23 1.00 8559 5997
## Modality2 0.64 0.04 0.57 0.71 1.00 7347 5787
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.91 0.01 0.89 0.93 1.00 11791 5667
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# Note on modality: Modality1 represents the baseline difference between combined and the other modalities, and Modality2 represents the difference between gesture and vocal compared to combined.
To be able to link these estimates back to the simulation coefficients, let’s create a function
transform_attempt <- function(intercept, CommAtt2M1, CommAtt3M2) {
# Effort for the first attempt (base effort)
Eff_attempt_1 <- intercept
# Effort for the second attempt
Eff_attempt_2 <- Eff_attempt_1 + CommAtt2M1
# Effort for the third attempt
Eff_attempt_3 <- Eff_attempt_1 + CommAtt2M1 + CommAtt3M2
# Calculate ratios
b_attempt2 <- Eff_attempt_2 / Eff_attempt_1
b_attempt3 <- Eff_attempt_3 / Eff_attempt_2
return(data.frame(b_attempt2, b_attempt3))
}
h1m1_coeff <- transform_attempt(4.02, 3.08, -4.39)
print(h1m1_coeff)
## b_attempt2 b_attempt3
## 1 1.766169 0.3816901
# For centered familiarity
fam = (4.02 + 1.26) / 4.02
print(fam)
## [1] 1.313433
# For centered BIF
bif = (4.02 + 1.29) / 4.02
print(bif)
## [1] 1.320896
# Expressibility is z-scored so we will not get to the coefficient in the same way but we can check the conditinal effects for checking whether it looks good
Let’s also check the visuals
plot(h1.m1)
# all caterpillars look nice
plot(conditional_effects(h1.m1), points = TRUE)
# the effects all go in good direction
pp_check(h1.m1, type = "dens_overlay")
# Looks good but not amazing - mostly because the posteriors seem to not know effort cannot be negative
pp_check(h1.m1, type = "error_scatter_avg")
# half blobby, half correlated, so still some room for improvement
# positive correlation means that errors increase with predicted values. So the model does perform well for some range, but becomes less reliable with increase in the predicted values
# also the blob is centered around 0 which is good
# it could be we are ignoring some interaction terms or non-linearity (which we know we kind of do). Transformation could also help (e.g., log). Of course, we are also still not specifying any priors so let's not yet make it a disaster
h1.m1_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.8381496 0.001662319 0.8348364 0.8412852
# explained variance around 83%
Overall, we see good directions of all predictors, mostly also in accordance with the expected coefficients. Of course, the synthetic data is quite complex so there might be other dependencies that moderate the causal relationships and that is why we do not see exactly the numbers we use to create the data.
Let’s have another model for comparison.
The fixed predictors make sense like this, as it will allow us to assess the effect of each on the effort, despite it not being our main research question. But we can assume that participants and concept have not only different baselines of effort (varying intercept). The effect of CommAtt on effort might vary across them too, hence we can try to add varying slopes for them and see whether the diagnostics improves. We will also add TrialNumber as a varying intercept, because we expect variation between earlier and later performances (because of learning, or opposite, fatigue) and we do not really need a single coefficient for this predictor anyway.
h1.m2 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c),
data = final_data,
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h1.m2 <- add_criterion(h1.m2, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h1.m2_R2 <- bayes_R2(h1.m2)
# Save both as objects
saveRDS(h1.m2, here("09_Analysis_Modeling", "models", "h1.m2.rds"))
saveRDS(h1.m2_R2, here("09_Analysis_Modeling", "models", "h1.m2_R2.rds"))
beep(5)
h1.m2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m2.rds"))
h1.m2_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m2_R2.rds"))
# Summary
summary(h1.m2)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c)
## Data: final_data (Number of observations: 5076)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 0.06 0.02 0.02 0.10 1.00 2791
## sd(CommAtt2M1) 0.17 0.04 0.11 0.26 1.00 3239
## sd(CommAtt3M2) 0.25 0.06 0.16 0.37 1.00 3386
## cor(Intercept,CommAtt2M1) 0.52 0.28 -0.14 0.93 1.00 1312
## cor(Intercept,CommAtt3M2) -0.35 0.32 -0.90 0.32 1.00 1261
## cor(CommAtt2M1,CommAtt3M2) -0.93 0.07 -1.00 -0.75 1.00 5691
## Tail_ESS
## sd(Intercept) 2048
## sd(CommAtt2M1) 4832
## sd(CommAtt3M2) 5133
## cor(Intercept,CommAtt2M1) 1617
## cor(Intercept,CommAtt3M2) 1450
## cor(CommAtt2M1,CommAtt3M2) 6716
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 0.15 0.04 0.08 0.22 1.00 901
## sd(CommAtt2M1) 0.78 0.06 0.68 0.90 1.00 2066
## sd(CommAtt3M2) 1.10 0.08 0.95 1.27 1.00 2076
## cor(Intercept,CommAtt2M1) 0.96 0.03 0.87 1.00 1.01 352
## cor(Intercept,CommAtt3M2) -0.96 0.04 -1.00 -0.85 1.01 353
## cor(CommAtt2M1,CommAtt3M2) -1.00 0.00 -1.00 -0.99 1.00 7139
## Tail_ESS
## sd(Intercept) 2511
## sd(CommAtt2M1) 3409
## sd(CommAtt3M2) 3623
## cor(Intercept,CommAtt2M1) 596
## cor(Intercept,CommAtt3M2) 690
## cor(CommAtt2M1,CommAtt3M2) 6841
##
## ~TrialNumber_c (Number of levels: 50)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.02 0.02 0.00 0.06 1.00 3119 3942
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 4.03 0.02 3.99 4.08 1.00 2093 3494
## CommAtt2M1 3.07 0.08 2.90 3.23 1.00 1384 2570
## CommAtt3M2 -4.36 0.12 -4.59 -4.13 1.00 1411 2477
## Modality1 -1.29 0.03 -1.35 -1.23 1.00 8986 6966
## Modality2 0.64 0.03 0.58 0.71 1.00 9767 6458
## Big5 1.08 0.05 0.99 1.16 1.01 854 2264
## Familiarity 1.04 0.05 0.94 1.12 1.00 957 2173
## Expressibility_z 0.44 0.02 0.41 0.47 1.00 5445 5281
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.80 0.01 0.78 0.81 1.00 11536 6199
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged but there were some divergent transitions
plot(h1.m2)
# some some of the caterpillars are not that pretty anymore
plot(conditional_effects(h1.m2), points = TRUE)
# the effects all go in good direction
pp_check(h1.m2, type = "dens_overlay")
# Looks good but not amazing - mostly because the posteriors seem to not know effort cannot be negative
pp_check(h1.m2, type = "error_scatter_avg")
# This looks a bit more blobby than before but still lots of errors for higher values
h1.m2_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.8760573 0.00125936 0.8734939 0.8784314
# explained variance around 87%
One of the reasons why we might be getting the divergent transition is because of the correlation between slopes and intercepts in the previous model. Let’s now get rid of it
h1.m3 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c),
data = final_data,
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h1.m3 <- add_criterion(h1.m3, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h1.m3_R2 <- bayes_R2(h1.m3)
# Save both as objects
saveRDS(h1.m3, here("09_Analysis_Modeling", "models", "h1.m3.rds"))
saveRDS(h1.m3_R2, here("09_Analysis_Modeling", "models", "h1.m3_R2.rds"))
beep(5)
h1.m3 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m3.rds"))
h1.m3_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m3_R2.rds"))
# Summary
summary(h1.m3)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c)
## Data: final_data (Number of observations: 5076)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.05 0.02 0.01 0.10 1.00 2648 2360
## sd(CommAtt2M1) 0.09 0.05 0.01 0.19 1.01 1159 1540
## sd(CommAtt3M2) 0.17 0.06 0.04 0.30 1.00 1422 1517
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.03 0.02 0.00 0.08 1.00 2442 4123
## sd(CommAtt2M1) 0.71 0.05 0.61 0.82 1.00 2733 4516
## sd(CommAtt3M2) 1.05 0.08 0.91 1.21 1.00 2231 3994
##
## ~TrialNumber_c (Number of levels: 50)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.02 0.02 0.00 0.06 1.00 3301 3712
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 4.03 0.02 3.99 4.06 1.00 9303 6039
## CommAtt2M1 3.07 0.07 2.93 3.22 1.00 2152 3434
## CommAtt3M2 -4.36 0.11 -4.58 -4.15 1.00 2100 3982
## Modality1 -1.30 0.03 -1.36 -1.24 1.00 11936 6326
## Modality2 0.65 0.03 0.58 0.71 1.00 12551 6401
## Big5 1.23 0.02 1.19 1.28 1.00 15911 6657
## Familiarity 1.20 0.02 1.15 1.24 1.00 14065 6574
## Expressibility_z 0.45 0.01 0.42 0.48 1.00 14842 5619
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.81 0.01 0.80 0.83 1.00 11557 5519
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged
plot(h1.m3)
# no correlation fixed the performance
plot(conditional_effects(h1.m3), points = TRUE)
# the effects all go in good direction
pp_check(h1.m3, type = "dens_overlay")
# the problem with predicting negative values remains
pp_check(h1.m3, type = "error_scatter_avg")
# unchanged
h1.m3_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.8714492 0.001445024 0.8685275 0.874215
# explained variance remains around 87%
priors_h1m3p <- c(
set_prior("normal(2.5, 0.5)", class = "Intercept", lb=0),
set_prior("normal(0,0.50)", class = "b", coef = "CommAtt2M1"),
set_prior("normal(0,0.50)", class = "b", coef = "CommAtt3M2"),
set_prior("normal(0,0.25)", class = "b", coef = "Modality1"),
set_prior("normal(0,0.25)", class = "b", coef = "Modality2"),
set_prior("normal(0,0.25)", class = "b", coef = "Big5"),
set_prior("normal(0,0.25)", class = "b", coef = "Familiarity"),
set_prior("normal(0,0.25)", class = "b", coef = "Expressibility_z"),
set_prior("normal(0.5,0.1)", class = "sd", group = "TrialNumber_c"),
set_prior("normal(0.5,0.1)", class = "sd", group = "Participant"),
set_prior("normal(0.5,0.1)", class = "sd", group = "Concept"),
set_prior("normal(1,0.1)", class = "sd"),
set_prior("normal(0.5,0.25)", class = "sigma")
)
h1.m3p <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c),
data = final_data,
prior=priors_h1m3p,
family = gaussian,
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h1.m3p <- add_criterion(h1.m3p, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h1.m3p_R2 <- bayes_R2(h1.m3p)
# Save both as objects
saveRDS(h1.m3p, here("09_Analysis_Modeling", "models", "h1.m3p.rds"))
saveRDS(h1.m3p_R2, here("09_Analysis_Modeling", "models", "h1.m3p_R2.rds"))
beep(5)
h1.m3p <- readRDS(here("09_Analysis_Modeling", "models", "h1.m3p.rds"))
h1.m3p_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m3p_R2.rds"))
# Summary
summary(h1.m3p)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c)
## Data: final_data (Number of observations: 5076)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.08 0.03 0.03 0.14 1.00 2732 4188
## sd(CommAtt2M1) 0.22 0.06 0.12 0.36 1.00 2853 4652
## sd(CommAtt3M2) 0.34 0.08 0.20 0.51 1.00 2590 5063
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.06 0.02 0.01 0.10 1.00 2220 2200
## sd(CommAtt2M1) 0.66 0.04 0.58 0.75 1.00 4131 5851
## sd(CommAtt3M2) 0.89 0.05 0.80 0.98 1.00 4638 6332
##
## ~TrialNumber_c (Number of levels: 50)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.04 0.02 0.00 0.08 1.00 2687 3691
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 4.02 0.02 3.97 4.07 1.00 8469 6434
## CommAtt2M1 2.99 0.08 2.83 3.15 1.00 2795 4372
## CommAtt3M2 -4.12 0.12 -4.35 -3.87 1.00 2401 3679
## Modality1 -1.27 0.03 -1.34 -1.21 1.00 13767 6673
## Modality2 0.63 0.03 0.56 0.69 1.00 13131 6444
## Big5 1.23 0.02 1.18 1.28 1.00 12639 6019
## Familiarity 1.19 0.03 1.14 1.24 1.00 13068 6412
## Expressibility_z 0.45 0.02 0.42 0.48 1.00 15210 5985
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.81 0.01 0.80 0.83 1.00 14125 5732
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged
plot(h1.m3p)
# all good
plot(conditional_effects(h1.m3p), points = TRUE)
# the effects all go in good direction
pp_check(h1.m3p, type = "dens_overlay")
# the problem with predicting negative values remains
pp_check(h1.m3p, type = "error_scatter_avg")
# unchanged
h1.m3p_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.8705705 0.001478901 0.8676255 0.873398
# explained variance remains around 87%
Let’s do one more test with the priors, restricting sd them with exponential distribution, i.e., negative values are not possible
priors_h1m4 <- c(
set_prior("normal(3, 0.3)", class = "Intercept", lb = 0),
set_prior("normal(0, 0.25)", class = "b", coef = "CommAtt2M1"),
set_prior("normal(0, 0.25)", class = "b", coef = "CommAtt3M2"),
set_prior("normal(0, 0.15)", class = "b", coef = "Modality1"),
set_prior("normal(0, 0.15)", class = "b", coef = "Modality2"),
set_prior("normal(0, 0.15)", class = "b", coef = "Big5"),
set_prior("normal(0, 0.15)", class = "b", coef = "Familiarity"),
set_prior("normal(0, 0.15)", class = "b", coef = "Expressibility_z"),
# Exponential priors for standard deviations
set_prior("exponential(3)", class = "sd", group = "TrialNumber_c"), # exp(3) has a mean of 1/3 and concentrates most density around small values
set_prior("exponential(3)", class = "sd", group = "Participant"),
set_prior("exponential(3)", class = "sd", group = "Concept"),
set_prior("exponential(1)", class = "sd"), # Generic sd prior
# Residual standard deviation - keep it narrow
set_prior("normal(0.5, 0.1)", class = "sigma")
)
h1.m4 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c),
data = final_data,
prior=priors_h1m4,
family = gaussian,
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h1.m4 <- add_criterion(h1.m4, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h1.m4_R2 <- bayes_R2(h1.m4)
# Save both as objects
saveRDS(h1.m4, here("09_Analysis_Modeling", "models", "h1.m4.rds"))
saveRDS(h1.m4_R2, here("09_Analysis_Modeling", "models", "h1.m4_R2.rds"))
beep(5)
h1.m4 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m4.rds"))
h1.m4_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m4_R2.rds"))
# Summary
summary(h1.m4)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c)
## Data: final_data (Number of observations: 5076)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.05 0.02 0.01 0.09 1.00 2376 2462
## sd(CommAtt2M1) 1.98 0.37 1.33 2.78 1.00 1362 2663
## sd(CommAtt3M2) 3.36 0.46 2.56 4.38 1.00 942 2031
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.03 0.02 0.00 0.08 1.00 2217 4211
## sd(CommAtt2M1) 0.71 0.05 0.61 0.82 1.00 2746 4486
## sd(CommAtt3M2) 1.05 0.08 0.92 1.22 1.00 2446 4360
##
## ~TrialNumber_c (Number of levels: 50)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.02 0.02 0.00 0.06 1.00 3383 3814
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 4.02 0.02 3.99 4.06 1.00 9553 6582
## CommAtt2M1 0.88 0.30 0.31 1.47 1.00 4070 5104
## CommAtt3M2 -0.36 0.26 -0.87 0.14 1.00 10008 6008
## Modality1 -1.23 0.03 -1.29 -1.16 1.00 14208 6281
## Modality2 0.59 0.03 0.53 0.65 1.00 15010 6334
## Big5 1.21 0.02 1.16 1.25 1.00 13972 6575
## Familiarity 1.17 0.02 1.12 1.21 1.00 13947 5839
## Expressibility_z 0.44 0.01 0.41 0.47 1.00 17170 6516
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.81 0.01 0.79 0.83 1.00 15722 5841
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# the coefficients for commatt now seem much lower
h1m1_coeff <- transform_attempt(4.02, 0.88, -0.36)
print(h1m1_coeff)
## b_attempt2 b_attempt3
## 1 1.218905 0.9265306
# For centered familiarity
fam = (4.02 + 1.17) / 4.02
print(fam)
## [1] 1.291045
# For centered BIF
bif = (4.02 + 1.21) / 4.02
print(bif)
## [1] 1.300995
So it seems that we are misleading with the priors and this will not be the way
plot(h1.m4)
# all good
plot(conditional_effects(h1.m4), points = TRUE)
# the effect of main predictor is now very moderated
pp_check(h1.m4, type = "dens_overlay")
# the problem with predicting negative values remains
pp_check(h1.m4, type = "error_scatter_avg")
# unchanged
h1.m4_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.8701813 0.001472199 0.867155 0.872996
# explained variance remains around 87%
So we still see negative values in the posterior simulations, so let’s try Student’s t-distribution which is more robut to outliers and can potentially reduce the likelihood of negative values (if we reduce degrees of freedom)
priors_h1m5 <- c(
set_prior("normal(2.5, 0.5)", class = "Intercept", lb=0),
set_prior("normal(0,0.50)", class = "b", coef = "CommAtt2M1"),
set_prior("normal(0,0.50)", class = "b", coef = "CommAtt3M2"),
set_prior("normal(0,0.25)", class = "b", coef = "Modality1"),
set_prior("normal(0,0.25)", class = "b", coef = "Modality2"),
set_prior("normal(0,0.25)", class = "b", coef = "Big5"),
set_prior("normal(0,0.25)", class = "b", coef = "Familiarity"),
set_prior("normal(0,0.25)", class = "b", coef = "Expressibility_z"),
set_prior("normal(0.5,0.05)", class = "sd", group = "TrialNumber_c"),
set_prior("normal(0.5,0.05)", class = "sd", group = "Participant"),
set_prior("normal(0.5,0.05)", class = "sd", group = "Concept"),
set_prior("normal(1,0.05)", class = "sd"),
set_prior("normal(0.5,0.1)", class = "sigma"),
set_prior("gamma(2, 0.1)", class = "nu") # Prior for degrees of freedom
)
h1.m5 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c),
data = final_data,
prior=priors_h1m5,
family = student,
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h1.m5 <- add_criterion(h1.m5, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h1.m5_R2 <- bayes_R2(h1.m5)
# Save both as objects
saveRDS(h1.m5, here("09_Analysis_Modeling", "models", "h1.m5.rds"))
saveRDS(h1.m5_R2, here("09_Analysis_Modeling", "models", "h1.m5_R2.rds"))
beep(5)
h1.m5 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m5.rds"))
h1.m5_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m5_R2.rds"))
# Summary
summary(h1.m5)
## Family: student
## Links: mu = identity; sigma = identity; nu = identity
## Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c)
## Data: final_data (Number of observations: 5076)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.38 0.06 0.26 0.49 1.00 5974 5242
## sd(CommAtt2M1) 0.42 0.05 0.32 0.53 1.00 8447 6363
## sd(CommAtt3M2) 0.46 0.05 0.36 0.56 1.00 8880 6408
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.13 0.02 0.10 0.17 1.00 3617 5313
## sd(CommAtt2M1) 0.57 0.03 0.51 0.63 1.00 6105 6010
## sd(CommAtt3M2) 0.72 0.03 0.66 0.79 1.00 7296 6471
##
## ~TrialNumber_c (Number of levels: 50)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.09 0.03 0.04 0.15 1.00 1649 2746
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 3.92 0.08 3.75 4.08 1.00 1736 3195
## CommAtt2M1 2.78 0.11 2.56 2.99 1.00 2927 3842
## CommAtt3M2 -3.95 0.13 -4.19 -3.70 1.00 2998 3937
## Modality1 -1.11 0.03 -1.17 -1.06 1.00 13750 6634
## Modality2 0.56 0.03 0.51 0.62 1.00 15623 6248
## Big5 1.15 0.03 1.10 1.21 1.00 10082 6655
## Familiarity 1.13 0.03 1.07 1.18 1.00 9422 6474
## Expressibility_z 0.39 0.01 0.36 0.41 1.00 16992 6093
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.52 0.01 0.50 0.55 1.00 11160 6372
## nu 2.96 0.16 2.67 3.28 1.00 13245 5813
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# the coefficients look ok again. Familiarity and Big5 even got closer to our simulated betas
plot(h1.m5)
# all good
plot(conditional_effects(h1.m5), points = TRUE)
# the effects all go in good direction
pp_check(h1.m5, type = "dens_overlay")
# the fit seems somewhat better than with gaussian, but still negative values are there
pp_check(h1.m5, type = "error_scatter_avg")
# unchanged
h1.m5_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.8541097 0.002443239 0.84921 0.8587499
# explained variance remains around 85%
We have already seen - when plotting - that effort probably tends towards lognormal distribution. We will keep the model constant, just exchange the distribution. Priors are kept default
h1.m6 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c),
data = final_data,
family = lognormal(),
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h1.m6 <- add_criterion(h1.m6, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h1.m6_R2 <- bayes_R2(h1.m6)
# Save both as objects
saveRDS(h1.m6, here("09_Analysis_Modeling", "models", "h1.m6.rds"))
saveRDS(h1.m6_R2, here("09_Analysis_Modeling", "models", "h1.m6_R2.rds"))
beep(5)
h1.m6 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m6.rds"))
h1.m6_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m6_R2.rds"))
# Summary
summary(h1.m6)
## Family: lognormal
## Links: mu = identity; sigma = identity
## Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt || Participant) + (1 + CommAtt || Concept) + (1 || TrialNumber_c)
## Data: final_data (Number of observations: 5076)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.01 0.00 0.00 0.02 1.00 2298 2414
## sd(CommAtt2M1) 0.00 0.00 0.00 0.01 1.00 4803 3682
## sd(CommAtt3M2) 0.01 0.01 0.00 0.02 1.00 3687 3650
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.05 0.00 0.04 0.06 1.00 2844 4508
## sd(CommAtt2M1) 0.00 0.00 0.00 0.01 1.00 4417 3655
## sd(CommAtt3M2) 0.01 0.01 0.00 0.03 1.00 2618 2888
##
## ~TrialNumber_c (Number of levels: 50)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 3411 3578
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.23 0.01 1.22 1.24 1.00 3120 4439
## CommAtt2M1 0.62 0.01 0.61 0.63 1.00 9634 6358
## CommAtt3M2 -1.10 0.01 -1.11 -1.08 1.00 9514 5588
## Modality1 -0.31 0.01 -0.33 -0.30 1.00 9097 6088
## Modality2 0.16 0.01 0.15 0.17 1.00 9223 6512
## Big5 0.32 0.01 0.31 0.34 1.00 2924 4558
## Familiarity 0.32 0.01 0.31 0.34 1.00 2297 3545
## Expressibility_z 0.12 0.00 0.12 0.13 1.00 10923 6710
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.17 0.00 0.17 0.18 1.00 11348 5998
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged
Of course now we have all on lognormal scale so it is not so straightforwadly interpretable. This is how we can get to the values of an effort under certain value of a predictor
library(emmeans)
# Compute estimated marginal means on log-scale
em_h1.m6 <- emmeans(h1.m6, ~ CommAtt) #~ CommAtt
#Backtransform the post.beta values
em_h1.m6@post.beta <- exp(em_h1.m6@post.beta)
print(em_h1.m6)
## CommAtt emmean lower.HPD upper.HPD
## 1 3.33 3.29 3.37
## 2 6.21 6.13 6.29
## 3 2.08 2.05 2.11
##
## Results are averaged over the levels of: Modality
## Point estimate displayed: median
## HPD interval probability: 0.95
So we indeed see that effort in CommAtt2 increase but then decreases again for CommAtt3. We can also try to get our simulated betas
coeff1 <- 3.33*1.5
coeff2 <- 3.33*0.5
print(coeff1)
## [1] 4.995
print(coeff2) # which is very close to the mean values we see from the model
## [1] 1.665
We see that the effect estimated by the model is now stronger than in our simulated data - also maybe because the values are averaged over predictor modality
Now let’s try a different method to get the coefficients (code adapted from https://bruno.nicenboim.me/bayescogsci/ch-reg.html#sec-trial)
# Extract posterior samples
alpha_samples <- as_draws_df(h1.m6)$b_Intercept
beta_2_vs_1 <- as_draws_df(h1.m6)$b_CommAtt2M1
beta_3_vs_2 <- as_draws_df(h1.m6)$b_CommAtt3M2
# Compute expected values on the log scale
mu_1 <- alpha_samples # CommAtt 1
mu_2 <- alpha_samples + beta_2_vs_1 # CommAtt 2
mu_3 <- alpha_samples + beta_2_vs_1 + beta_3_vs_2 # CommAtt 3
# Transform to original scale
effect_1 <- exp(mu_1)
effect_2 <- exp(mu_2)
effect_3 <- exp(mu_3)
# Calculate contrasts on the original scale
effect_diff_2_vs_1 <- effect_2 - effect_1
effect_diff_3_vs_2 <- effect_3 - effect_2
effect_diff_3_vs_1 <- effect_3 - effect_1
# Summarize the effects
list(
mean_intercept = mean(effect_1),
mean_diff_2_vs_1 = c(mean = mean(effect_diff_2_vs_1), quantile(effect_diff_2_vs_1, c(0.025, 0.975))),
mean_diff_3_vs_2 = c(mean = mean(effect_diff_3_vs_2), quantile(effect_diff_3_vs_2, c(0.025, 0.975))),
mean_diff_3_vs_1 = c(mean = mean(effect_diff_3_vs_1), quantile(effect_diff_3_vs_1, c(0.025, 0.975)))
)
## $mean_intercept
## [1] 3.419549
##
## $mean_diff_2_vs_1
## mean 2.5% 97.5%
## 2.950688 2.872812 3.029410
##
## $mean_diff_3_vs_2
## mean 2.5% 97.5%
## -4.240333 -4.321665 -4.160104
##
## $mean_diff_3_vs_1
## mean 2.5% 97.5%
## -1.289646 -1.320918 -1.257698
Now we can use these coefficients to transform to our simulated betas
h1m6_coeff <- transform_attempt(3.419549, 2.950688, -4.240333)
print(h1m6_coeff)
## b_attempt2 b_attempt3
## 1 1.862888 0.3343524
Now this looks closer again to our betas.
This is for all predictors (except concept and participant)
# Extract posterior samples
posterior_samples <- as_draws_df(h1.m6)
alpha_samples <- posterior_samples$b_Intercept
# Create a list to store effects for each fixed factor
effect_list <- list()
# Helper function to calculate summary statistics
get_effect_summary <- function(effect_samples) {
mean_effect <- mean(effect_samples)
se_effect <- sd(effect_samples)
ci_effect <- quantile(effect_samples, c(0.025, 0.975))
post_prob <- mean(effect_samples > 0)
c(mean = mean_effect,
se = se_effect,
lower_ci = ci_effect[1],
upper_ci = ci_effect[2],
post_prob = post_prob)
}
# COMMATT (successive differences coding)
if ("b_CommAtt2M1" %in% colnames(posterior_samples) & "b_CommAtt3M2" %in% colnames(posterior_samples)) {
beta_2_vs_1 <- posterior_samples$b_CommAtt2M1
beta_3_vs_2 <- posterior_samples$b_CommAtt3M2
mu_1 <- alpha_samples
mu_2 <- alpha_samples + beta_2_vs_1
mu_3 <- alpha_samples + beta_2_vs_1 + beta_3_vs_2
effect_list$CommAtt <- rbind(
"commat 2 vs 1" = get_effect_summary(exp(mu_2) - exp(mu_1)),
"commat 3 vs 2" = get_effect_summary(exp(mu_3) - exp(mu_2)),
"commat 3 vs 1" = get_effect_summary(exp(mu_3) - exp(mu_1))
)
}
# MODALITY (sum contrasts scaled by 0.5)
if ("b_Modality1" %in% colnames(posterior_samples) & "b_Modality2" %in% colnames(posterior_samples)) {
beta_mod_1 <- posterior_samples$b_Modality1
beta_mod_2 <- posterior_samples$b_Modality2
mu_mod_1 <- alpha_samples + beta_mod_1
mu_mod_2 <- alpha_samples + beta_mod_2
mu_mod_3 <- alpha_samples - beta_mod_1 - beta_mod_2
effect_list$Modality <- rbind(
"mod 1 vs 2" = get_effect_summary(exp(mu_mod_1) - exp(mu_mod_2)),
"mod 1 vs 3" = get_effect_summary(exp(mu_mod_1) - exp(mu_mod_3)),
"mod 2 vs 3" = get_effect_summary(exp(mu_mod_2) - exp(mu_mod_3))
)
}
# BIG5 (continuous)
if ("b_Big5" %in% colnames(posterior_samples)) {
beta_big5 <- posterior_samples$b_Big5
effect_list$Big5 <- get_effect_summary(exp(alpha_samples + beta_big5) - exp(alpha_samples))
}
# FAMILIARITY (continuous)
if ("b_Familiarity" %in% colnames(posterior_samples)) {
beta_fam <- posterior_samples$b_Familiarity
effect_list$Familiarity <- get_effect_summary(exp(alpha_samples + beta_fam) - exp(alpha_samples))
}
# EXPRESSIBILITY_Z (continuous)
if ("b_Expressibility_z" %in% colnames(posterior_samples)) {
beta_expr <- posterior_samples$b_Expressibility_z
effect_list$Expressibility_z <- get_effect_summary(exp(alpha_samples + beta_expr) - exp(alpha_samples))
}
# TRIAL NUMBER (centered continuous)
if ("b_TrialNumber_c" %in% colnames(posterior_samples)) {
beta_trial <- posterior_samples$b_TrialNumber_c
effect_list$TrialNumber_c <- get_effect_summary(exp(alpha_samples + beta_trial) - exp(alpha_samples))
}
# Convert to a nicely formatted data frame
effect_df <- do.call(rbind, effect_list)
# View effects
effect_df
## mean se lower_ci.2.5% upper_ci.97.5% post_prob
## commat 2 vs 1 2.95068779 0.04052085 2.87281233 3.0294096 1.00000
## commat 3 vs 2 -4.24033346 0.04140812 -4.32166524 -4.1601045 0.00000
## commat 3 vs 1 -1.28964568 0.01620638 -1.32091801 -1.2576981 0.00000
## mod 1 vs 2 -1.52303929 0.04030161 -1.60299691 -1.4435554 0.00000
## mod 1 vs 3 -1.49131528 0.04098655 -1.57288685 -1.4107477 0.00000
## mod 2 vs 3 0.03172401 0.04858764 -0.06380552 0.1252410 0.74175
## Big5 1.31309529 0.04264827 1.23007768 1.3970338 1.00000
## Familiarity 1.30972571 0.04408809 1.22420946 1.3983930 1.00000
## Expressibility_z 0.44448503 0.01197175 0.42135330 0.4684496 1.00000
So here we also see the negative effect of combined modality (mod1)
plot(h1.m6)
# all good
plot(conditional_effects(h1.m6), points = TRUE)
# the effects all go in good direction
pp_check(h1.m6, type = "dens_overlay")
# we got rid of negative predictions, and it looks very good
pp_check(h1.m6, type = "error_scatter_avg")
# very nice blob
h1.m6_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.8889641 0.00137284 0.8861744 0.8915303
# explained variance increases to 88%
Since we now significantly improved the model performance, let’s try once again the correlation between slope and intercept
h1.m6c <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c),
data = final_data,
family = lognormal(),
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h1.m6c <- add_criterion(h1.m6c, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h1.m6c_R2 <- bayes_R2(h1.m6c)
# Save both as objects
saveRDS(h1.m6c, here("09_Analysis_Modeling", "models", "h1.m6c.rds"))
saveRDS(h1.m6c_R2, here("09_Analysis_Modeling", "models", "h1.m6c_R2.rds"))
beep(5)
h1.m6c <- readRDS(here("09_Analysis_Modeling", "models", "h1.m6c.rds"))
h1.m6c_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m6c_R2.rds"))
# Summary
summary(h1.m6c)
## Family: lognormal
## Links: mu = identity; sigma = identity
## Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c)
## Data: final_data (Number of observations: 5076)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 0.01 0.00 0.00 0.02 1.00 1744
## sd(CommAtt2M1) 0.01 0.00 0.00 0.02 1.00 5613
## sd(CommAtt3M2) 0.01 0.01 0.00 0.03 1.00 3784
## cor(Intercept,CommAtt2M1) 0.28 0.49 -0.78 0.95 1.00 11362
## cor(Intercept,CommAtt3M2) 0.33 0.47 -0.75 0.95 1.00 7628
## cor(CommAtt2M1,CommAtt3M2) -0.05 0.50 -0.89 0.87 1.00 8530
## Tail_ESS
## sd(Intercept) 1448
## sd(CommAtt2M1) 5243
## sd(CommAtt3M2) 4533
## cor(Intercept,CommAtt2M1) 5945
## cor(Intercept,CommAtt3M2) 5887
## cor(CommAtt2M1,CommAtt3M2) 6522
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 0.05 0.00 0.04 0.06 1.00 3640
## sd(CommAtt2M1) 0.01 0.00 0.00 0.02 1.00 5290
## sd(CommAtt3M2) 0.01 0.01 0.00 0.03 1.00 3474
## cor(Intercept,CommAtt2M1) 0.35 0.46 -0.73 0.95 1.00 10725
## cor(Intercept,CommAtt3M2) -0.40 0.40 -0.94 0.59 1.00 9703
## cor(CommAtt2M1,CommAtt3M2) -0.27 0.49 -0.94 0.78 1.00 5410
## Tail_ESS
## sd(Intercept) 5384
## sd(CommAtt2M1) 4470
## sd(CommAtt3M2) 4069
## cor(Intercept,CommAtt2M1) 5934
## cor(Intercept,CommAtt3M2) 5524
## cor(CommAtt2M1,CommAtt3M2) 6511
##
## ~TrialNumber_c (Number of levels: 50)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 4362 4575
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.23 0.01 1.22 1.24 1.00 5205 5979
## CommAtt2M1 0.62 0.01 0.61 0.63 1.00 14611 6124
## CommAtt3M2 -1.10 0.01 -1.11 -1.08 1.00 12219 6336
## Modality1 -0.32 0.01 -0.33 -0.30 1.00 14134 6778
## Modality2 0.16 0.01 0.15 0.18 1.00 15159 6553
## Big5 0.32 0.01 0.31 0.34 1.00 4826 6410
## Familiarity 0.32 0.01 0.30 0.34 1.00 4174 5246
## Expressibility_z 0.12 0.00 0.12 0.13 1.00 13884 6876
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.17 0.00 0.17 0.18 1.00 17858 5344
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged
plot(h1.m6c)
# now correlation does not seem to generate problems
plot(conditional_effects(h1.m6c), points = TRUE)
# the effects all go in good direction
pp_check(h1.m6c, type = "dens_overlay")
# ppcheck good
pp_check(h1.m6c, type = "error_scatter_avg")
# nice blob
h1.m6c_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.8890392 0.001393478 0.8861745 0.8916361
# explained variance remains around 88%
Let’s now check which model has the best diagnostics
# Extract R-hat values for each model
rhat_list <- lapply(model_list, function(model) {
rhat_values <- rhat(model)
data.frame(model = deparse(substitute(model)),
max_rhat = max(rhat_values),
min_rhat = min(rhat_values))
})
# Combine and inspect
do.call(rbind, rhat_list)
## model max_rhat min_rhat
## 1 X[[i]] 1.002349 0.9996758
## 2 X[[i]] 1.011516 0.9997215
## 3 X[[i]] 1.005103 0.9997167
## 4 X[[i]] 1.003503 0.9996573
## 5 X[[i]] 1.004119 0.9996640
## 6 X[[i]] 1.003712 0.9996386
## 7 X[[i]] 1.002459 0.9995773
## 8 X[[i]] 1.002605 0.9996426
All models seems actually ok in terms of Rhat values except model 2 (h1.m2)
Effective sample size tells how many independent samples the model has effectively drawn from the PD. Low ESS suggests autocorrelation (i.e., sample explores one part of posterior), while high ESS means good mix
# Extract n_eff values for each model
neff_ratio_list <- lapply(model_list, function(model) {
neff_values <- neff_ratio(model) # Here we calculate ratio (not the raw number of effective samples)
data.frame(model = deparse(substitute(model)),
min_neff = min(neff_values),
max_neff = max(neff_values),
mean_neff = mean(neff_values))
})
# Combine and inspect
do.call(rbind, neff_ratio_list)
## model min_neff max_neff mean_neff
## 1 X[[i]] 0.23283181 0.8832823 0.6862927
## 2 X[[i]] 0.04396011 0.9020581 0.6204512
## 3 X[[i]] 0.14481431 0.8834718 0.7504325
## 4 X[[i]] 0.22967606 0.9020381 0.7709286
## 5 X[[i]] 0.11769241 0.9065446 0.7505601
## 6 X[[i]] 0.20616401 0.9322803 0.7457049
## 7 X[[i]] 0.28072060 0.8984722 0.7626900
## 8 X[[i]] 0.18101711 0.9370772 0.7924409
So the highest ratio have model h1.m6c (lognormal with correlation) but in fact they are all quite comparable. Let’s loot at 3 highest
effective_sample(h1.m6c) # this one indeed seems the best as it has all ESS around 10k
## Parameter ESS
## 1 b_Intercept 5193
## 2 b_CommAtt2M1 14438
## 3 b_CommAtt3M2 12340
## 4 b_Modality1 14518
## 5 b_Modality2 14927
## 6 b_Big5 4824
## 7 b_Familiarity 4162
## 8 b_Expressibility_z 14233
effective_sample(h1.m6)
## Parameter ESS
## 1 b_Intercept 3104
## 2 b_CommAtt2M1 9618
## 3 b_CommAtt3M2 9509
## 4 b_Modality1 9057
## 5 b_Modality2 9192
## 6 b_Big5 2906
## 7 b_Familiarity 2280
## 8 b_Expressibility_z 10811
effective_sample(h1.m3p)
## Parameter ESS
## 1 b_Intercept 8533
## 2 b_CommAtt2M1 2795
## 3 b_CommAtt3M2 2385
## 4 b_Modality1 13744
## 5 b_Modality2 13144
## 6 b_Big5 12483
## 7 b_Familiarity 13139
## 8 b_Expressibility_z 15614
So there are some considerable differences for different coefficients
Leave-one-out (loo) validation
l <- loo_compare(h1.m1, h1.m2, h1.m3, h1.m3p, h1.m4, h1.m5, h1.m6, h1.m6c, criterion = "loo")
print(l, simplify = F)
## elpd_diff se_diff elpd_loo se_elpd_loo p_loo se_p_loo looic se_looic
## h1.m6c 0.0 0.0 -5070.5 69.3 126.5 3.5 10141.0 138.5
## h1.m6 -0.2 1.5 -5070.7 69.2 120.4 3.3 10141.4 138.5
## h1.m5 -904.2 59.2 -5974.6 74.1 418.5 5.3 11949.3 148.1
## h1.m2 -1143.5 67.5 -6214.0 78.4 285.6 10.7 12428.0 156.8
## h1.m3 -1221.1 66.5 -6291.6 75.5 261.3 7.7 12583.2 150.9
## h1.m3p -1222.5 66.9 -6293.0 76.3 281.6 8.3 12586.0 152.5
## h1.m4 -1224.7 67.4 -6295.2 77.0 281.3 8.4 12590.3 154.0
## h1.m1 -1681.0 72.2 -6751.5 74.9 25.3 0.8 13503.0 149.9
elpd_loo: This is the expected log pointwise predictive density for LOO. Higher values indicate a better fit to the data.
se_elpd_loo: The standard error of the elpd_loo, representing uncertainty in the model’s predictive fit according to LOO.
looic: The LOO Information Criterion, which is similar to waic but based on leave-one-out cross-validation. Lower values are better.
p_loo: The effective number of parameters according to LOO, indicating the model’s complexity.
se_p_loo: The standard error of p_loo, representing uncertainty around the effective number of parameters.
So lognormal seems the best.
Information criterion (WAIC)
w <- loo_compare(h1.m1, h1.m2, h1.m3, h1.m3p, h1.m4, h1.m5, h1.m6, h1.m6c, criterion = "waic")
print(w, simplify = F)
## elpd_diff se_diff elpd_waic se_elpd_waic p_waic se_p_waic waic
## h1.m6c 0.0 0.0 -5070.3 69.3 126.3 3.5 10140.5
## h1.m6 -0.3 1.5 -5070.5 69.2 120.2 3.3 10141.0
## h1.m5 -903.7 59.2 -5974.0 74.1 417.8 5.3 11948.0
## h1.m2 -1142.2 67.5 -6212.5 78.3 284.1 10.6 12424.9
## h1.m3 -1220.4 66.5 -6290.7 75.4 260.3 7.6 12581.3
## h1.m3p -1221.6 66.9 -6291.8 76.2 280.4 8.2 12583.7
## h1.m4 -1223.8 67.3 -6294.1 77.0 280.3 8.4 12588.2
## h1.m1 -1681.2 72.2 -6751.5 74.9 25.3 0.8 13502.9
## se_waic
## h1.m6c 138.5
## h1.m6 138.5
## h1.m5 148.1
## h1.m2 156.6
## h1.m3 150.9
## h1.m3p 152.5
## h1.m4 153.9
## h1.m1 149.9
# see Solomon Kurz
cbind(waic_diff = w[,1] * -2,
se = w[,2] * 2)
## waic_diff se
## h1.m6c 0.0000000 0.000000
## h1.m6 0.5289111 2.917133
## h1.m5 1807.4580148 118.471557
## h1.m2 2284.3843630 134.987550
## h1.m3 2440.8062251 132.903242
## h1.m3p 2443.1522534 133.864984
## h1.m4 2447.6767321 134.678147
## h1.m1 3362.3866702 144.475730
elpd_waic (expected log pointwise predictive density for WAIC): This represents the model’s predictive fit to the data. Higher values indicate a better fit.
se_elpd_waic (standard error of elpd_waic): Measures uncertainty around the elpd_waic estimate.
waic: The Widely Applicable Information Criterion, a measure of model fit where lower values indicate a better fit.
se_waic (standard error of WAIC): Uncertainty around the WAIC estimate.
elpd_diff: The difference in the elpd_waic between the model in question and the baseline model (fit_eff_2, which has elpd_diff of 0). A negative value indicates that the model fits worse than fit_eff_2.
se_diff: The standard error of the elpd_diff, indicating how much uncertainty there is in the difference in predictive performance.
p_waic: The number of effective parameters in the model (related to model complexity). Lower values indicate simpler models, and higher values suggest more complexity.
Plot the comparison
model_weights(h1.m1, h1.m2, h1.m3, h1.m3p, h1.m4, h1.m5, h1.m6, h1.m6c, weights = "waic") %>%
round(digits = 2)
## h1.m1 h1.m2 h1.m3 h1.m3p h1.m4 h1.m5 h1.m6 h1.m6c
## 0.00 0.00 0.00 0.00 0.00 0.00 0.43 0.57
So as ppcheck already suggested, lognormal model indeed seem to have the most predictive power. For this particular (synthetic) data, we will now proceed with model h1.m6c
We will first add some mildly informative priors, and then we also try to add some interaction terms and do a comparison once again.
Let’s first check what priors have been selected as default for h1.m6c
# Print priors
prior_summary(h1.m6c)
## prior class coef group resp dpar
## (flat) b
## (flat) b Big5
## (flat) b CommAtt2M1
## (flat) b CommAtt3M2
## (flat) b Expressibility_z
## (flat) b Familiarity
## (flat) b Modality1
## (flat) b Modality2
## student_t(3, 1.3, 2.5) Intercept
## lkj_corr_cholesky(1) L
## lkj_corr_cholesky(1) L Concept
## lkj_corr_cholesky(1) L Participant
## student_t(3, 0, 2.5) sd
## student_t(3, 0, 2.5) sd Concept
## student_t(3, 0, 2.5) sd CommAtt2M1 Concept
## student_t(3, 0, 2.5) sd CommAtt3M2 Concept
## student_t(3, 0, 2.5) sd Intercept Concept
## student_t(3, 0, 2.5) sd Participant
## student_t(3, 0, 2.5) sd CommAtt2M1 Participant
## student_t(3, 0, 2.5) sd CommAtt3M2 Participant
## student_t(3, 0, 2.5) sd Intercept Participant
## student_t(3, 0, 2.5) sd TrialNumber_c
## student_t(3, 0, 2.5) sd Intercept TrialNumber_c
## student_t(3, 0, 2.5) sigma
## nlpar lb ub source
## default
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## default
## default
## (vectorized)
## (vectorized)
## 0 default
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 default
Ok, we can keep all defaulted ones, but we do not need to leave flat priors for the beta coefficients as we do have some assumptions/expectations
Let’s reuse priors we have already used for h1.m3p
priors_h1m7 <- c(
set_prior("normal(2.5, 0.5)", class = "Intercept", lb=0),
set_prior("normal(0,0.50)", class = "b", coef = "CommAtt2M1"),
set_prior("normal(0,0.50)", class = "b", coef = "CommAtt3M2"),
set_prior("normal(0,0.25)", class = "b", coef = "Modality1"),
set_prior("normal(0,0.25)", class = "b", coef = "Modality2"),
set_prior("normal(0,0.25)", class = "b", coef = "Big5"),
set_prior("normal(0,0.25)", class = "b", coef = "Familiarity"),
set_prior("normal(0,0.25)", class = "b", coef = "Expressibility_z")
)
# The rest we will leave default (and check afterwards)
h1.m7 <- brm(Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c),
data = final_data,
family = lognormal(),
prior = priors_h1m7,
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h1.m7 <- add_criterion(h1.m7, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h1.m7_R2 <- bayes_R2(h1.m7)
# Save both as objects
saveRDS(h1.m7, here("09_Analysis_Modeling", "models", "h1.m7.rds"))
saveRDS(h1.m7_R2, here("09_Analysis_Modeling", "models", "h1.m7_R2.rds"))
beep(5)
h1.m7 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m7.rds"))
h1.m7_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m7_R2.rds"))
# Summary
summary(h1.m7)
## Family: lognormal
## Links: mu = identity; sigma = identity
## Formula: Eff ~ 1 + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c)
## Data: final_data (Number of observations: 5076)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 0.01 0.00 0.00 0.02 1.00 2406
## sd(CommAtt2M1) 0.01 0.00 0.00 0.02 1.00 5336
## sd(CommAtt3M2) 0.01 0.01 0.00 0.03 1.00 3663
## cor(Intercept,CommAtt2M1) 0.27 0.49 -0.79 0.95 1.00 9049
## cor(Intercept,CommAtt3M2) 0.34 0.47 -0.74 0.95 1.00 7153
## cor(CommAtt2M1,CommAtt3M2) -0.05 0.50 -0.89 0.88 1.00 8030
## Tail_ESS
## sd(Intercept) 2582
## sd(CommAtt2M1) 4860
## sd(CommAtt3M2) 4212
## cor(Intercept,CommAtt2M1) 5794
## cor(Intercept,CommAtt3M2) 6108
## cor(CommAtt2M1,CommAtt3M2) 5937
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 0.05 0.00 0.04 0.06 1.00 3562
## sd(CommAtt2M1) 0.01 0.00 0.00 0.02 1.00 5277
## sd(CommAtt3M2) 0.01 0.01 0.00 0.03 1.00 3118
## cor(Intercept,CommAtt2M1) 0.34 0.46 -0.72 0.95 1.00 11493
## cor(Intercept,CommAtt3M2) -0.40 0.40 -0.94 0.57 1.00 7854
## cor(CommAtt2M1,CommAtt3M2) -0.26 0.48 -0.94 0.77 1.00 5771
## Tail_ESS
## sd(Intercept) 5189
## sd(CommAtt2M1) 4743
## sd(CommAtt3M2) 3237
## cor(Intercept,CommAtt2M1) 6204
## cor(Intercept,CommAtt3M2) 5447
## cor(CommAtt2M1,CommAtt3M2) 6837
##
## ~TrialNumber_c (Number of levels: 50)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 4640 4109
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.23 0.01 1.22 1.24 1.00 5166 5351
## CommAtt2M1 0.62 0.01 0.61 0.63 1.00 12440 6165
## CommAtt3M2 -1.10 0.01 -1.11 -1.08 1.00 10543 6212
## Modality1 -0.31 0.01 -0.33 -0.30 1.00 13106 5622
## Modality2 0.16 0.01 0.15 0.17 1.00 13551 6493
## Big5 0.32 0.01 0.31 0.34 1.00 4012 4552
## Familiarity 0.32 0.01 0.30 0.34 1.00 4337 5552
## Expressibility_z 0.12 0.00 0.12 0.13 1.00 11948 6505
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.17 0.00 0.17 0.18 1.00 15087 5361
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged
plot(h1.m7)
# all good
plot(conditional_effects(h1.m7), points = TRUE)
# the effects all go in good direction
pp_check(h1.m7, type = "dens_overlay")
# nice
pp_check(h1.m7, type = "error_scatter_avg")
# unchanged
h1.m7_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.8889672 0.001399296 0.8861732 0.8916425
# explained variance remains around 88%
Let’s now also check whether the priors make sense
Possible interactions include:
Note: Interesting, it would help disentagle the benefit of multimodality over other - maybe more effort overal but less attempts. However, we already model effect of modality on effort, so maybe this is not top priority
Note: Not priority I would say
Note: Not priority (especially since expressibiliy has already modality encoded)
Note: Does not seem to be priority
Note: Not priority, but it is interesting if we want to tap more into the interindividual variability
Let’s try CommAtt x Modality and Big5 x CommAtt
h1.m8 <- brm(Eff ~ 1 + CommAtt * Modality + Big5 * CommAtt + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c),
data = final_data,
family = lognormal(),
prior = priors_h1m7, # we keep the priors from previous model
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h1.m8 <- add_criterion(h1.m8, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h1.m8_R2 <- bayes_R2(h1.m8)
# Save both as objects
saveRDS(h1.m8, here("09_Analysis_Modeling", "models", "h1.m8.rds"))
saveRDS(h1.m8_R2, here("09_Analysis_Modeling", "models", "h1.m8_R2.rds"))
beep(5)
h1.m8 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m8.rds"))
h1.m8_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h1.m8_R2.rds"))
# Summary
summary(h1.m8)
## Family: lognormal
## Links: mu = identity; sigma = identity
## Formula: Eff ~ 1 + CommAtt * Modality + Big5 * CommAtt + Familiarity + Expressibility_z + (1 + CommAtt | Participant) + (1 + CommAtt | Concept) + (1 | TrialNumber_c)
## Data: final_data (Number of observations: 5076)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 0.01 0.00 0.00 0.02 1.00 2549
## sd(CommAtt2M1) 0.01 0.00 0.00 0.02 1.00 5836
## sd(CommAtt3M2) 0.01 0.01 0.00 0.03 1.00 4322
## cor(Intercept,CommAtt2M1) 0.29 0.48 -0.76 0.94 1.00 10754
## cor(Intercept,CommAtt3M2) 0.36 0.46 -0.70 0.95 1.00 8535
## cor(CommAtt2M1,CommAtt3M2) -0.04 0.50 -0.89 0.86 1.00 7971
## Tail_ESS
## sd(Intercept) 2905
## sd(CommAtt2M1) 4937
## sd(CommAtt3M2) 4495
## cor(Intercept,CommAtt2M1) 5912
## cor(Intercept,CommAtt3M2) 6615
## cor(CommAtt2M1,CommAtt3M2) 6944
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 0.05 0.00 0.04 0.06 1.00 3682
## sd(CommAtt2M1) 0.01 0.00 0.00 0.02 1.00 5068
## sd(CommAtt3M2) 0.01 0.01 0.00 0.03 1.00 3392
## cor(Intercept,CommAtt2M1) 0.34 0.45 -0.71 0.94 1.00 12364
## cor(Intercept,CommAtt3M2) -0.41 0.39 -0.94 0.57 1.00 9996
## cor(CommAtt2M1,CommAtt3M2) -0.26 0.49 -0.95 0.78 1.00 5862
## Tail_ESS
## sd(Intercept) 5433
## sd(CommAtt2M1) 5479
## sd(CommAtt3M2) 3982
## cor(Intercept,CommAtt2M1) 5945
## cor(Intercept,CommAtt3M2) 6038
## cor(CommAtt2M1,CommAtt3M2) 6673
##
## ~TrialNumber_c (Number of levels: 50)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 4900 4106
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept 1.23 0.01 1.22 1.24 1.00 5217
## CommAtt2M1 0.62 0.01 0.61 0.63 1.00 14494
## CommAtt3M2 -1.10 0.01 -1.11 -1.08 1.00 13959
## Modality1 -0.31 0.01 -0.33 -0.30 1.00 12978
## Modality2 0.16 0.01 0.15 0.18 1.00 13925
## Big5 0.32 0.01 0.30 0.34 1.00 4506
## Familiarity 0.32 0.01 0.30 0.34 1.00 3885
## Expressibility_z 0.12 0.00 0.12 0.13 1.00 12340
## CommAtt2M1:Modality1 0.01 0.02 -0.02 0.04 1.00 13065
## CommAtt3M2:Modality1 0.01 0.02 -0.03 0.05 1.00 12851
## CommAtt2M1:Modality2 -0.01 0.02 -0.04 0.02 1.00 13667
## CommAtt3M2:Modality2 0.02 0.02 -0.02 0.06 1.00 12109
## CommAtt2M1:Big5 -0.00 0.01 -0.02 0.02 1.00 14725
## CommAtt3M2:Big5 -0.01 0.01 -0.04 0.01 1.00 14109
## Tail_ESS
## Intercept 6099
## CommAtt2M1 7032
## CommAtt3M2 6248
## Modality1 6662
## Modality2 6627
## Big5 5386
## Familiarity 5579
## Expressibility_z 6403
## CommAtt2M1:Modality1 6974
## CommAtt3M2:Modality1 6884
## CommAtt2M1:Modality2 6960
## CommAtt3M2:Modality2 6404
## CommAtt2M1:Big5 6229
## CommAtt3M2:Big5 6504
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.17 0.00 0.17 0.18 1.00 17574 5831
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# Coefficients remain mostly unchanged
# Since we did not really focused on the interactions during the simulation, we also don't have strong expectations here. But for the real data, there is a good reason to expect some meaningful values
plot(h1.m8)
# all good
plot(conditional_effects(h1.m8), points = TRUE)
# the effects all go in good direction
# we can see here that combined modality remains moderated across all commatts
# and also big5 seems to matter the most in the second attempt
pp_check(h1.m8, type = "dens_overlay")
# the problem with predicting negative values remains
pp_check(h1.m8, type = "error_scatter_avg")
# unchanged
h1.m8_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.8889759 0.001459646 0.8859485 0.8916926
# explained variance remains around 87%
effective_sample(h1.m6)
## Parameter ESS
## 1 b_Intercept 3104
## 2 b_CommAtt2M1 9618
## 3 b_CommAtt3M2 9509
## 4 b_Modality1 9057
## 5 b_Modality2 9192
## 6 b_Big5 2906
## 7 b_Familiarity 2280
## 8 b_Expressibility_z 10811
effective_sample(h1.m6c)
## Parameter ESS
## 1 b_Intercept 5193
## 2 b_CommAtt2M1 14438
## 3 b_CommAtt3M2 12340
## 4 b_Modality1 14518
## 5 b_Modality2 14927
## 6 b_Big5 4824
## 7 b_Familiarity 4162
## 8 b_Expressibility_z 14233
effective_sample(h1.m7)
## Parameter ESS
## 1 b_Intercept 5149
## 2 b_CommAtt2M1 12363
## 3 b_CommAtt3M2 10483
## 4 b_Modality1 13095
## 5 b_Modality2 13808
## 6 b_Big5 3976
## 7 b_Familiarity 4298
## 8 b_Expressibility_z 11917
effective_sample(h1.m8)
## Parameter ESS
## 1 b_Intercept 5221
## 2 b_CommAtt2M1 14165
## 3 b_CommAtt3M2 13961
## 4 b_Modality1 12872
## 5 b_Modality2 13799
## 6 b_Big5 4484
## 7 b_Familiarity 3878
## 8 b_Expressibility_z 12296
## 9 b_CommAtt2M1:Modality1 13103
## 10 b_CommAtt3M2:Modality1 12880
## 11 b_CommAtt2M1:Modality2 13651
## 12 b_CommAtt3M2:Modality2 12303
## 13 b_CommAtt2M1:Big5 15040
## 14 b_CommAtt3M2:Big5 13963
# Now h1m6 looks as the weakest, while h1m8 looks much better - but ESS for Intercept is for some reason still quite low
Add the criteria to each model
l <- loo_compare(h1.m6, h1.m6c, h1.m7, h1.m8, criterion = "loo")
print(l, simplify = F)
## elpd_diff se_diff elpd_loo se_elpd_loo p_loo se_p_loo looic se_looic
## h1.m6c 0.0 0.0 -5070.5 69.3 126.5 3.5 10141.0 138.5
## h1.m6 -0.2 1.5 -5070.7 69.2 120.4 3.3 10141.4 138.5
## h1.m7 -0.3 0.2 -5070.8 69.3 126.8 3.5 10141.5 138.6
## h1.m8 -4.4 1.9 -5074.8 69.2 132.9 3.7 10149.7 138.5
w <- loo_compare(h1.m6, h1.m6c, h1.m7, h1.m8, criterion = "waic")
print(w, simplify = F)
## elpd_diff se_diff elpd_waic se_elpd_waic p_waic se_p_waic waic
## h1.m6c 0.0 0.0 -5070.3 69.3 126.3 3.5 10140.5
## h1.m6 -0.3 1.5 -5070.5 69.2 120.2 3.3 10141.0
## h1.m7 -0.3 0.2 -5070.5 69.3 126.5 3.5 10141.1
## h1.m8 -4.4 1.9 -5074.6 69.2 132.7 3.7 10149.2
## se_waic
## h1.m6c 138.5
## h1.m6 138.5
## h1.m7 138.6
## h1.m8 138.5
# see Solomon Kurz
cbind(waic_diff = w[,1] * -2,
se = w[,2] * 2)
## waic_diff se
## h1.m6c 0.0000000 0.0000000
## h1.m6 0.5289111 2.9171328
## h1.m7 0.5582963 0.3803033
## h1.m8 8.7265395 3.7869214
So in terms of WAIC&LOO, interactions do not really add predictive power. This might be specific for the synthetic data, as we did not explicitly focused on the interaction coefficients there. At the same time, the difference of h1.m8 from h1.m6c is not so significant. For now, we stop here and with the real data, we will proceed with the same evaluation to pick up the best model
Now we can need to account also for our second hypothesis, namely
H2: A higher degree of misunderstanding will require a performer to engage in more effortful correction.
The difficulty with our data structure is that PrevAn is a variable that has values only for second and third correction (i.e., there is no previous answer for the first performance). If we left it unchange, the model will eventually get rid of all NA values, which means we will loose the relationship between effort in second and first communicative attempt. To avoid this, we will create a new variable which we call Effort Change Ratio. We will simply calculate the ratio of change from effort in communicative attempt x to communicative attempt x+1. Like that, we will still get rid of communicative attempt 1, but the ratio that will be associated with CommAtt==2 already encapsulates the relationship towards this attempt.
# Calculate Effort Change Ratio
final_data_2 <- final_data %>%
group_by(Participant, Concept) %>%
mutate(
Effort_1 = Eff[CommAtt == 1][1], # Effort for attempt 1
Effort_2 = Eff[CommAtt == 2][1], # Effort for attempt 2
Effort_3 = Eff[CommAtt == 3][1], # Effort for attempt 3
Effort_Change_Ratio_1_to_2 = case_when(
CommAtt == 2 & !is.na(Effort_1) ~ Eff / Effort_1,
TRUE ~ NA_real_
),
Effort_Change_Ratio_2_to_3 = case_when(
CommAtt == 3 & !is.na(Effort_2) ~ Eff / Effort_2,
TRUE ~ NA_real_
)
) %>%
ungroup()
# Collide ratios into single col
final_data_2 <- final_data_2 %>%
mutate(
Effort_Change_Ratio = coalesce(Effort_Change_Ratio_1_to_2, Effort_Change_Ratio_2_to_3)
)
# Remove unnecessary columns
final_data_2 <- subset(final_data_2, select = -c(Effort_1, Effort_2, Effort_3, Effort_Change_Ratio_1_to_2, Effort_Change_Ratio_2_to_3))
# View the result
head(final_data_2)
## # A tibble: 6 × 11
## Participant Concept Modality Big5 Familiarity Expressibility CommAtt Eff
## <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 1 combined 1.92 1.84 0.576 1 4.34
## 2 1 1 combined 1.92 1.84 0.576 2 6.57
## 3 1 1 combined 1.92 1.84 0.576 3 2.37
## 4 1 2 gesture 1.92 1.84 0.349 1 5.50
## 5 1 2 gesture 1.92 1.84 0.349 2 10.4
## 6 1 2 gesture 1.92 1.84 0.349 3 4.00
## # ℹ 3 more variables: TrialNumber <dbl>, PrevAn <dbl>,
## # Effort_Change_Ratio <dbl>
Let’s check in plots
So we will have to work with bimodal distribution, as we either have a
decrease in effort (Ratio < 1) or increase (Ratio > 1)
#H2: Stating causal model
We now also need a new DAG. Essentially, what we said will influence CommAtt in H1, will now also influence PrevAn because they are tightly related. For instance, more extroverted people can be expected to be better guessers, therefore the similarity of the previous answer will be higher.
## { Big5, Conc, Expr, Fam, Mod, Pcn, TrNum }
The adjustment set is identical to the one of H1. Note that we are here omitting arrows going from all these variables to ComAtt. However, since PrevAn affects CommAtt, and not the other way, we anyway do not need to block this path to avoid confounds. However, if we want to asses direct effect of PrevAn on EffChange, we have to add CommAtt to the model as well
H2: A higher degree of misunderstanding will require a performer to engage in more effortful correction.
filtered_data$CommAtt <- as.factor(filtered_data$CommAtt)
filtered_data$Modality <- as.factor(filtered_data$Modality)
filtered_data$Participant <- as.factor(filtered_data$Participant)
filtered_data$Concept <- as.factor(filtered_data$Concept)
filtered_data$TrialNumber <- as.numeric(filtered_data$TrialNumber) # Ensure TrialNumber is numeric
contrasts(filtered_data$CommAtt) <- MASS::contr.sdif(2)
contrasts(filtered_data$Modality) <- contr.sum(3)/2
filtered_data$TrialNumber_c <- filtered_data$TrialNumber - median(range(filtered_data$TrialNumber))
filtered_data$Familiarity <- filtered_data$Familiarity - median(range(filtered_data$Familiarity))
filtered_data$Big5 <- filtered_data$Big5 - median(range(filtered_data$Big5))
filtered_data <-
filtered_data |>
group_by(Modality) |>
mutate(Expressibility_z = (Expressibility - mean(Expressibility))/ sd(filtered_data$Expressibility, na.rm = T)) |>
ungroup()
filtered_data <-
filtered_data |>
mutate(PrevAn_z = (PrevAn - mean(PrevAn))/ sd(filtered_data$PrevAn, na.rm = T)) |>
ungroup()
Note: maybe we should center where we z-score to get better sense of the predictors (right now, for me it's still a bit uninterpretable)
h2.m1 <- brm(Effort_Change_Ratio ~ 1 + PrevAn_z + CommAtt + Familiarity + Big5 + Expressibility_z + TrialNumber_c + Modality + (1 | Participant) + (1 | Concept),
data = filtered_data,
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h2.m1 <- add_criterion(h2.m1, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h2.m1_R2 <- bayes_R2(h2.m1)
# Save both as objects
saveRDS(h2.m1, here("09_Analysis_Modeling", "models", "h2.m1.rds"))
saveRDS(h2.m1_R2, here("09_Analysis_Modeling", "models", "h2.m1_R2.rds"))
beep(5)
h2.m1 <- readRDS(here("09_Analysis_Modeling", "models", "h2.m1.rds"))
h2.m1_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.m1_R2.rds"))
# Summary
summary(h2.m1)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: Effort_Change_Ratio ~ 1 + PrevAn_z + CommAtt + Familiarity + Big5 + Expressibility_z + TrialNumber_c + Modality + (1 | Participant) + (1 | Concept)
## Data: filtered_data (Number of observations: 2556)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 3361 3453
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 3037 3838
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.11 0.00 1.10 1.11 1.00 12420 6655
## PrevAn_z -0.16 0.00 -0.16 -0.15 1.00 16015 5424
## CommAtt2M1 -1.54 0.00 -1.55 -1.53 1.00 21480 4412
## Familiarity 0.00 0.00 -0.00 0.01 1.00 18103 5539
## Big5 -0.00 0.00 -0.01 0.01 1.00 17827 6080
## Expressibility_z 0.00 0.00 -0.00 0.01 1.00 15822 6089
## TrialNumber_c 0.00 0.00 -0.00 0.00 1.00 7961 6172
## Modality1 -0.00 0.00 -0.01 0.01 1.00 14413 6436
## Modality2 -0.01 0.01 -0.02 0.00 1.00 13936 6092
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.09 0.00 0.09 0.09 1.00 15108 5752
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# coefficients seem quite conservative, even for PrevAn (but remember, it's z-scored)
plot(h2.m1)
# all looks ok
plot(conditional_effects(h2.m1), points = TRUE)
# the main effect seems ok, the rest of the predictors does not show nothing (maybe because of the transformation we lost the relations between them and the effort)
pp_check(h2.m1, type = "dens_overlay")
# this looks quite okay-ish
pp_check(h2.m1, type = "error_scatter_avg")
# so there seems to be quite high residual error for some extreme values (esp for the second mode)
h2.m1_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.9860988 7.752928e-05 0.9859373 0.9862448
# explained variance around 98%
We will once again try lognormal distribution, for now leaving priors to default
h2.m2 <- brm(Effort_Change_Ratio ~ 1 + PrevAn_z + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + PrevAn || Participant) + (1 + PrevAn || Concept) + (1 | TrialNumber_c),
data = filtered_data,
family = lognormal(),
iter = 4000,
cores = 4)
# Add criterions for later diagnostics
h2.m2 <- add_criterion(h2.m2, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h2.m2_R2 <- bayes_R2(h2.m2)
# Save both as objects
saveRDS(h2.m2, here("09_Analysis_Modeling", "models", "h2.m2.rds"))
saveRDS(h2.m2_R2, here("09_Analysis_Modeling", "models", "h2.m2_R2.rds"))
beep(5)
h2.m2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.m2.rds"))
h2.m2_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.m2_R2.rds"))
# Summary
summary(h2.m2)
## Family: lognormal
## Links: mu = identity; sigma = identity
## Formula: Effort_Change_Ratio ~ 1 + PrevAn_z + CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + PrevAn || Participant) + (1 + PrevAn || Concept) + (1 | TrialNumber_c)
## Data: filtered_data (Number of observations: 2556)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 4108 4460
## sd(PrevAn) 0.00 0.00 0.00 0.01 1.00 5318 4098
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 2293 4226
## sd(PrevAn) 0.00 0.00 0.00 0.01 1.00 3211 4384
##
## ~TrialNumber_c (Number of levels: 49)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 4800 4380
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.24 0.00 -0.24 -0.23 1.00 10195 6482
## PrevAn_z -0.12 0.00 -0.12 -0.12 1.00 7996 5075
## CommAtt2M1 -1.72 0.00 -1.73 -1.72 1.00 18165 5973
## Modality1 -0.00 0.00 -0.01 0.01 1.00 15611 5899
## Modality2 -0.00 0.00 -0.01 0.01 1.00 14981 6417
## Big5 -0.00 0.00 -0.01 0.00 1.00 13219 6273
## Familiarity -0.00 0.00 -0.01 0.00 1.00 13664 6321
## Expressibility_z 0.00 0.00 -0.00 0.00 1.00 9934 6219
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.07 0.00 0.07 0.07 1.00 17435 5346
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# Extract posterior samples
posterior_samples <- as_draws_df(h2.m2)
alpha_samples <- posterior_samples$b_Intercept
# Create a list to store effects for each fixed factor
effect_list <- list()
# Helper function to calculate summary statistics
get_effect_summary <- function(effect_samples) {
mean_effect <- mean(effect_samples)
se_effect <- sd(effect_samples)
ci_effect <- quantile(effect_samples, c(0.025, 0.975))
post_prob <- mean(effect_samples > 0)
c(mean = mean_effect,
se = se_effect,
lower_ci = ci_effect[1],
upper_ci = ci_effect[2],
post_prob = post_prob)
}
# Compute expected values on the log scale
mu_1 <- alpha_samples # CommAtt 1
# Transform to original scale
effect_1 <- exp(mu_1)
# PrevAn (continuous)
if ("b_PrevAn_z" %in% colnames(posterior_samples)) {
beta_prevAn <- posterior_samples$b_PrevAn_z
effect_list$PrevAn_z <- get_effect_summary(exp(alpha_samples + beta_prevAn) - exp(alpha_samples))
}
effect_list$Intercept_mean <- mean(effect_1)
# Convert to a nicely formatted data frame
effect_df <- do.call(rbind, effect_list)
# View effects
effect_df
## mean se lower_ci.2.5% upper_ci.97.5% post_prob
## PrevAn_z -0.08764297 0.0009697511 -0.08953147 -0.0857059 0.0000000
## Intercept_mean 0.78830407 0.7883040707 0.78830407 0.7883041 0.7883041
Still difficult to interpret as PrevAn is z-scored
plot(h2.m2)
# looks good
plot(conditional_effects(h2.m2), points = TRUE)
# main predictor looks ok, but the rest still moderated
pp_check(h2.m2, type = "dens_overlay")
# this looks very nice
pp_check(h2.m2, type = "error_scatter_avg")
# so higher values are kind blobby with no particular trend, but the lower values are still quite errorneous
h2.m2_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.9985866 0.0001091032 0.9983392 0.9987595
# explained variance 99%
There are some values we should be cautious about - the relationship between predicted values and residual error is still quite correlated for lower values, and the explained variance would in real-data case probably hint to over-fitting. However, we created synthetic data that does not contain as much noise as we can expect from the real data, so it is quite likely that if we simply used all predictors we know to cause a variance here, we might be explaining most of the variation. It is quite likely this will not be the case for real data.
At this point, we could be essentially content with the last model. However, as both outcome variable and the main predictor are continuous variable, we might expect nonlinear patterns in their relationship. For instance, it might be the case that for answers that are completely off, people do not feel motivated to put in more effort, and sum kind of sensible ‘threshold’ needs to be reached for the effort to increase to resolve misunderstanding, and then again decrease when the answer is very similar.
We will use a combinaiton of B-splines and Bayesian GAMs to prepare some models and try them fit to our data - even if we currently should not expect any non-linearity as we have not coded any, as we can see in the following plot
What b-splines do is they build up wiggly functions from more simple
components which are called ‘basis functions’ (B). What they do is they
divide a full range of predictor into parts and assign a parameter to
each part. The parameters are gradually turned on and off in a way that
makes their sum into a curve. Unlike polynomial regression, b-splines do
not transform predictor, but they invent a series of new synthetic
predictor variables. Each of them then exists only to gradually turn a
specific parameter on and off within a specific range of the real
predictor variable. Each of these synthetic variables is called a basis
function B. See more in Richard McElreath’s Statistical Rethinking.
We are using the code from Solomon Kurz’s adaptation of this book (https://bookdown.org/content/4857/geocentric-models.html)
# Get rid of NAs in the predictor
d <-
final_data_2 %>%
drop_na(PrevAn)
# And convert all that is necessary to factor/numerical
d$CommAtt <- as.factor(d$CommAtt)
d$Modality <- as.factor(d$Modality)
d$Participant <- as.factor(d$Participant)
d$Concept <- as.factor(d$Concept)
d$TrialNumber <- as.numeric(d$TrialNumber)
Here we can see summary of each predictor
d %>%
select_if(is.numeric) %>% # Select only numeric columns
pivot_longer(cols = everything(), names_to = "key", values_to = "value") %>%
group_by(key) %>%
summarise(
mean = mean(value, na.rm = TRUE),
sd = sd(value, na.rm = TRUE),
ll = quantile(value, probs = 0.055, na.rm = TRUE),
ul = quantile(value, probs = 0.945, na.rm = TRUE)
) %>%
mutate(across(where(is.double), round, digits = 2))
## # A tibble: 7 × 5
## key mean sd ll ul
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Big5 0.98 0.57 0.09 1.93
## 2 Eff 5.1 2.79 1.43 9.79
## 3 Effort_Change_Ratio 1.35 0.75 0.28 2.19
## 4 Expressibility 0.47 0.38 0.07 1.32
## 5 Familiarity 1.13 0.55 0.15 1.9
## 6 PrevAn 0.5 0.29 0.05 0.94
## 7 TrialNumber 22.6 12.4 3 42
print(d)
## # A tibble: 2,556 × 11
## Participant Concept Modality Big5 Familiarity Expressibility CommAtt Eff
## <fct> <fct> <fct> <dbl> <dbl> <dbl> <fct> <dbl>
## 1 1 1 combined 1.92 1.84 0.576 2 6.57
## 2 1 1 combined 1.92 1.84 0.576 3 2.37
## 3 1 2 gesture 1.92 1.84 0.349 2 10.4
## 4 1 2 gesture 1.92 1.84 0.349 3 4.00
## 5 1 3 vocal 1.92 1.84 0.340 2 12.9
## 6 1 3 vocal 1.92 1.84 0.340 3 3.59
## 7 1 4 combined 1.92 1.84 0.719 2 6.41
## 8 1 6 gesture 1.92 1.84 0.00419 2 12.0
## 9 1 6 gesture 1.92 1.84 0.00419 3 3.44
## 10 1 8 vocal 1.92 1.84 0.591 2 9.11
## # ℹ 2,546 more rows
## # ℹ 3 more variables: TrialNumber <dbl>, PrevAn <dbl>,
## # Effort_Change_Ratio <dbl>
Now we need to specify knots that function as pivots for number of different basis functions. The B variable then tells you which knot you are close to.
num_knots <- 7
knot_list <- quantile(d$PrevAn, probs = seq(from = 0, to = 1, length.out = num_knots))
knot_list
## 0% 16.66667% 33.33333% 50% 66.66667% 83.33333%
## 0.0001020494 0.1540585973 0.3319149296 0.5038721841 0.6719870094 0.8383128390
## 100%
## 0.9989464423
Now we can see how we chopped the data by the knots
Now we need to specify the polynomial degree which determines how
parameters interact to produce the spline.
For degree 1, two basis functions combine at each point. For degree 2, three functions combine at each point. For degree 3, four combine.
library(splines)
B <- bs(d$PrevAn,
knots = knot_list[-c(1, num_knots)],
degree = 3, # cubic spline
intercept = TRUE)
This is how cubic spline with 7 knots look like
We will now add this B matrix into the data to be able to further use it
in the models
d2 <-
d %>%
mutate(B = B)
# take a look at the structure of `d3
d2 %>% glimpse()
## Rows: 2,556
## Columns: 12
## $ Participant <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ Concept <fct> 1, 1, 2, 2, 3, 3, 4, 6, 6, 8, 8, 9, 9, 14, 15, 15,…
## $ Modality <fct> combined, combined, gesture, gesture, vocal, vocal…
## $ Big5 <dbl> 1.919899, 1.919899, 1.919899, 1.919899, 1.919899, …
## $ Familiarity <dbl> 1.8439059, 1.8439059, 1.8439059, 1.8439059, 1.8439…
## $ Expressibility <dbl> 0.576172309, 0.576172309, 0.349378655, 0.349378655…
## $ CommAtt <fct> 2, 3, 2, 3, 2, 3, 2, 2, 3, 2, 3, 2, 3, 2, 2, 3, 2,…
## $ Eff <dbl> 6.566250, 2.367010, 10.367605, 3.998448, 12.934986…
## $ TrialNumber <dbl> 2, 3, 5, 6, 8, 9, 11, 14, 15, 18, 19, 21, 22, 28, …
## $ PrevAn <dbl> 0.98450637, 0.82035722, 0.48565959, 0.09090184, 0.…
## $ Effort_Change_Ratio <dbl> 1.5116202, 0.3604812, 1.8857553, 0.3856675, 2.1021…
## $ B <bs[,9]> <bs[26 x 9]>
The B matrix is now a matrix column which contains the same number of rows as the others, but has also 9 columns within that columns. Each of them correspond to one synthetic B variable.
We can now use this matrix to fit our model
h2.s1 <-
brm(data = d2,
family = gaussian,
Effort_Change_Ratio ~ 1 + B,
prior = c(prior(normal(100, 10), class = Intercept),
prior(normal(0, 10), class = b),
prior(exponential(1), class = sigma)),
iter = 4000, warmup = 2000, chains = 4, cores = 4,
seed = 4)
# Add criterions for later diagnostics
h2.s1 <- add_criterion(h2.s1, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h2.s1_R2 <- bayes_R2(h2.s1)
# Save both as objects
saveRDS(h2.s1, here("09_Analysis_Modeling", "models", "h2.s1.rds"))
saveRDS(h2.s1_R2, here("09_Analysis_Modeling", "models", "h2.s1_R2.rds"))
Model summary
h2.s1 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s1.rds"))
h2.s1_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s1_R2.rds"))
# Summary
print(h2.s1)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: Effort_Change_Ratio ~ 1 + B
## Data: d2 (Number of observations: 2556)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.35 3.28 -5.11 7.94 1.00 954 1238
## B1 0.12 3.28 -6.48 6.57 1.00 956 1246
## B2 0.16 3.29 -6.46 6.65 1.00 956 1255
## B3 0.38 3.28 -6.24 6.84 1.00 954 1248
## B4 -0.02 3.29 -6.63 6.46 1.00 956 1242
## B5 0.08 3.28 -6.53 6.54 1.00 956 1260
## B6 -0.17 3.29 -6.75 6.34 1.00 954 1242
## B7 -0.15 3.29 -6.76 6.31 1.00 958 1277
## B8 -0.21 3.29 -6.83 6.28 1.00 953 1275
## B9 -0.29 3.29 -6.92 6.19 1.00 955 1242
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.73 0.01 0.71 0.75 1.00 3635 3204
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
It’s a bit difficult to see what is going on so let’s just plot it
Now with the predictor
So as expected, we see quite linear decrease similar to our previous
models
Now let’s check some diagnostics
plot(h2.s1)
# looks good
pp_check(h2.s1, type = "dens_overlay")
# because we used only main predictor, we can see quite a bad result of the posterior predictive distribution, but we will work on that
pp_check(h2.s1, type = "error_scatter_avg")
# high residual error for both low and high values
h2.s1_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.04744733 0.007830488 0.03295914 0.06351053
# explained variance 4%
Now we will use a different workflow, using smooth functions with brms. brms allow for non-linear models by borrowing functions from mgcv package (REF: Wood)
This is how priors should look like for one of these functions
get_prior(data = d2,
family = gaussian,
Effort_Change_Ratio ~ 1 + s(PrevAn))
## prior class coef group resp dpar nlpar lb ub
## (flat) b
## (flat) b sPrevAn_1
## student_t(3, 1.7, 2.5) Intercept
## student_t(3, 0, 2.5) sds 0
## student_t(3, 0, 2.5) sds s(PrevAn) 0
## student_t(3, 0, 2.5) sigma 0
## source
## default
## (vectorized)
## default
## default
## (vectorized)
## default
At this point, I will just keep priors at default
h2.s2 <-
brm(data = d2,
family = gaussian,
Effort_Change_Ratio ~ 1 + s(PrevAn, bs = "bs", k = 19),
iter = 4000,
warmup = 2000,
chains = 4,
cores = 4,
seed = 4,
control = list(adapt_delta = .99)
)
# Add criterions for later diagnostics
h2.s2 <- add_criterion(h2.s2, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h2.s2_R2 <- bayes_R2(h2.s2)
# Save both as objects
saveRDS(h2.s2, here("09_Analysis_Modeling", "models", "h2.s2.rds"))
saveRDS(h2.s2_R2, here("09_Analysis_Modeling", "models", "h2.s2_R2.rds"))
Now we can proceed as we usually do
h2.s2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s2.rds"))
h2.s2_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s2_R2.rds"))
# Summary
summary(h2.s2)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: Effort_Change_Ratio ~ 1 + s(PrevAn, bs = "bs", k = 19)
## Data: d2 (Number of observations: 2556)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Smoothing Spline Hyperparameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sds(sPrevAn_1) 0.01 0.01 0.00 0.03 1.00 1526 2979
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.35 0.01 1.33 1.38 1.00 10258 6031
## sPrevAn_1 -0.08 0.01 -0.09 -0.06 1.00 6284 5193
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.73 0.01 0.71 0.75 1.00 11605 5819
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(h2.s2)
# looks good
plot(conditional_effects(h2.s2), points = TRUE)
# again, we see similar (non-linear) trend
pp_check(h2.s2, type = "dens_overlay")
# same as before, ignoring bimodality
pp_check(h2.s2, type = "error_scatter_avg")
# still weird
h2.s2_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.04336834 0.007618562 0.02908593 0.05921277
# explained variance 4%
Now let’s use the same model, but adding our other predictors
h2.s3 <-
brm(
data = filtered_data,
family = gaussian,
Effort_Change_Ratio ~ 1 +
s(PrevAn_z, bs = "bs", k = 19) + # Smooth for Previous Answer similarity
+ CommAtt + Modality + Big5 + Familiarity + Expressibility_z + # Fixed effects
(1 | Participant) + (1 | Concept), # Random effects
iter = 4000,
warmup = 2000,
chains = 4,
cores = 4,
seed = 4,
control = list(adapt_delta = .99)
)
# Add criterions for later diagnostics
h2.s3 <- add_criterion(h2.s3, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h2.s3_R2 <- bayes_R2(h2.s3)
# Save both as objects
saveRDS(h2.s3, here("09_Analysis_Modeling", "models", "h2.s3.rds"))
saveRDS(h2.s3_R2, here("09_Analysis_Modeling", "models", "h2.s3_R2.rds"))
h2.s3 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s3.rds"))
h2.s3_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s3_R2.rds"))
# Summary
summary(h2.s3)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: Effort_Change_Ratio ~ 1 + s(PrevAn_z, bs = "bs", k = 19) + +CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 | Participant) + (1 | Concept)
## Data: filtered_data (Number of observations: 2556)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Smoothing Spline Hyperparameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sds(sPrevAn_z_1) 0.00 0.00 0.00 0.00 1.00 2278 3567
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 3314 3910
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 3210 4292
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.11 0.00 1.10 1.11 1.00 11236 6961
## CommAtt2M1 -1.54 0.00 -1.55 -1.53 1.00 21880 5650
## Modality1 0.00 0.00 -0.01 0.01 1.00 15488 6225
## Modality2 -0.01 0.01 -0.02 0.00 1.00 15124 6351
## Big5 -0.00 0.00 -0.01 0.01 1.00 19293 6143
## Familiarity 0.00 0.00 -0.00 0.01 1.00 20031 5691
## Expressibility_z 0.00 0.00 -0.00 0.01 1.00 14573 5746
## sPrevAn_z_1 -0.08 0.00 -0.08 -0.08 1.00 7024 6935
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.09 0.00 0.09 0.09 1.00 21128 5955
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(h2.s3)
# looks ok
plot(conditional_effects(h2.s3), points = TRUE)
# now the ffect of PrevAn looks a bit different because we are back to using z-scored version
pp_check(h2.s3, type = "dens_overlay")
# looks ok but the ppd undershoots the low values and overshoot the high values
pp_check(h2.s3, type = "error_scatter_avg")
# still kind of weird, and high values positively correlated
h2.s3_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.9861186 7.755798e-05 0.9859575 0.9862589
# 98% variance
As we see in the ppcheck, we are still a bit struggling with the distribution, so let’s try once again a lognormal one
h2.s4 <-
brm(
Effort_Change_Ratio ~ 1 +
s(PrevAn_z, bs = "bs", k = 19) + # Nonlinear effect for Previous Answer similarity
+ CommAtt + Modality + Big5 + Familiarity + Expressibility_z +
(1 + PrevAn_z || Participant) + # Random slopes and intercepts for Participant
(1 + PrevAn_z || Concept) + # Random slopes and intercepts for Concept
(1 | TrialNumber_c), # Random intercept for Trial Number
data = filtered_data,
family = lognormal(),
iter = 4000,
warmup = 2000,
chains = 4,
cores = 4,
seed = 4,
control = list(adapt_delta = .999,
max_treedepth = 12)
)
# Add criterions for later diagnostics
h2.s4 <- add_criterion(h2.s4, criterion = c("loo", "waic"))
# Calculate also variance explained (R^2)
h2.s4_R2 <- bayes_R2(h2.s4)
# Save both as objects
saveRDS(h2.s4, here("09_Analysis_Modeling", "models", "h2.s4.rds"))
saveRDS(h2.s4_R2, here("09_Analysis_Modeling", "models", "h2.s4_R2.rds"))
h2.s4 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s4.rds"))
h2.s4_R2 <- readRDS(here("09_Analysis_Modeling", "models", "h2.s4_R2.rds"))
# Summary
summary(h2.s4)
## Family: lognormal
## Links: mu = identity; sigma = identity
## Formula: Effort_Change_Ratio ~ 1 + s(PrevAn_z, bs = "bs", k = 19) + +CommAtt + Modality + Big5 + Familiarity + Expressibility_z + (1 + PrevAn_z || Participant) + (1 + PrevAn_z || Concept) + (1 | TrialNumber_c)
## Data: filtered_data (Number of observations: 2556)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Smoothing Spline Hyperparameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sds(sPrevAn_z_1) 0.00 0.00 0.00 0.00 1.00 2682 4039
##
## Multilevel Hyperparameters:
## ~Concept (Number of levels: 21)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 3801 4084
## sd(PrevAn_z) 0.00 0.00 0.00 0.01 1.00 3743 3720
##
## ~Participant (Number of levels: 120)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 1880 3012
## sd(PrevAn_z) 0.00 0.00 0.00 0.01 1.00 3870 4554
##
## ~TrialNumber_c (Number of levels: 49)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.00 0.00 0.00 0.01 1.00 4249 3836
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.24 0.00 -0.24 -0.23 1.00 10336 7006
## CommAtt2M1 -1.72 0.00 -1.73 -1.72 1.00 16906 6230
## Modality1 -0.00 0.00 -0.01 0.01 1.00 15101 6177
## Modality2 -0.00 0.00 -0.01 0.01 1.00 14856 6513
## Big5 -0.00 0.00 -0.01 0.00 1.00 13114 5925
## Familiarity -0.00 0.00 -0.01 0.00 1.00 13584 6054
## Expressibility_z 0.00 0.00 -0.00 0.00 1.00 9683 5907
## sPrevAn_z_1 -0.06 0.00 -0.06 -0.06 1.00 9083 7104
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.07 0.00 0.07 0.07 1.00 14570 5307
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(h2.s4)
# looks ok
plot(conditional_effects(h2.s4), points = TRUE)
# for main predictor ok, CommAtt has a strong affect as can be expected. The rest is probably moderated but that's okay for now
pp_check(h2.s4, type = "dens_overlay")
# but this looks very good
pp_check(h2.s4, type = "error_scatter_avg")
# so for the higher-values mode it looks ok (blob around 0), but the low values are still quite error-prone. At the same time, the scale of x-axis has quite small range
h2.s4_R2
## Estimate Est.Error Q2.5 Q97.5
## R2 0.9986769 0.0001261677 0.9983885 0.9988757
# 99% explained variance - we are quite likely overfitting here, but as mentioned above, this is perhaps expectable for synthetic data
Before we proceed further, let’s do first round of diagnostics
# Extract R-hat values for each model
rhat_list <- lapply(model_list, function(model) {
rhat_values <- rhat(model)
data.frame(model = deparse(substitute(model)),
max_rhat = max(rhat_values),
min_rhat = min(rhat_values))
})
# Combine and inspect
do.call(rbind, rhat_list)
## model max_rhat min_rhat
## 1 X[[i]] 1.003347 0.9999272
## 2 X[[i]] 1.003156 0.9997275
## 3 X[[i]] 1.001528 1.0003296
## 4 X[[i]] 1.002482 1.0002600
## 5 X[[i]] 1.003105 0.9996580
## 6 X[[i]] 1.004370 0.9996854
All Rhat values look good
Effective sample size tells how many independent samples the model has effectively drawn from the PD. Low ESS suggests autocorrelation (i.e., sample explores one part of posterior), while high ESS means good mix
# Extract n_eff values for each model
neff_ratio_list <- lapply(model_list, function(model) {
neff_values <- neff_ratio(model) # Here we calculate ratio (not the raw number of effective samples)
data.frame(model = deparse(substitute(model)),
min_neff = min(neff_values),
max_neff = max(neff_values),
mean_neff = mean(neff_values))
})
# Combine and inspect
do.call(rbind, neff_ratio_list)
## model min_neff max_neff mean_neff
## 1 X[[i]] 0.3064723 0.9110065 0.7814734
## 2 X[[i]] 0.2718387 0.9254677 0.7632834
## 3 X[[i]] 0.1190914 0.4004991 0.1717864
## 4 X[[i]] 0.1907208 0.8094037 0.5994023
## 5 X[[i]] 0.2629642 0.8886650 0.7698058
## 6 X[[i]] 0.2350103 0.9275640 0.7720078
They all look good expect the very first non-linear model h2.s1 which is quite expectable since we used only main predictor
effective_sample(h2.m1)
## Parameter ESS
## 1 b_Intercept 12515
## 2 b_PrevAn_z 15913
## 3 b_CommAtt2M1 21729
## 4 b_Familiarity 17809
## 5 b_Big5 17830
## 6 b_Expressibility_z 15791
## 7 b_TrialNumber_c 8061
## 8 b_Modality1 14644
## 9 b_Modality2 13812
effective_sample(h2.m2)
## Parameter ESS
## 1 b_Intercept 9992
## 2 b_PrevAn_z 7925
## 3 b_CommAtt2M1 18072
## 4 b_Modality1 15601
## 5 b_Modality2 14816
## 6 b_Big5 13249
## 7 b_Familiarity 13571
## 8 b_Expressibility_z 9882
effective_sample(h2.s4) # this one probably looks the nices
## Parameter ESS
## 1 b_Intercept 10266
## 2 b_CommAtt2M1 17155
## 3 b_Modality1 15003
## 4 b_Modality2 14686
## 5 b_Big5 13033
## 6 b_Familiarity 13764
## 7 b_Expressibility_z 9603
## 8 bs_sPrevAn_z_1 9100
effective_sample(h2.s3)
## Parameter ESS
## 1 b_Intercept 11182
## 2 b_CommAtt2M1 21782
## 3 b_Modality1 15420
## 4 b_Modality2 15244
## 5 b_Big5 19494
## 6 b_Familiarity 19777
## 7 b_Expressibility_z 14408
## 8 bs_sPrevAn_z_1 7070
Leave-one-out (loo) validation
l <- loo_compare(h2.m1, h2.m2, h2.s1, h2.s2, h2.s3, h2.s4, criterion = "loo")
print(l, simplify = F)
## elpd_diff se_diff elpd_loo se_elpd_loo p_loo se_p_loo looic se_looic
## h2.s4 0.0 0.0 3127.4 40.7 42.0 1.8 -6254.8 81.4
## h2.m2 -7.2 4.4 3120.2 40.6 34.8 1.4 -6240.4 81.1
## h2.s3 -564.5 54.3 2562.9 36.0 23.0 0.7 -5125.8 72.1
## h2.m1 -564.8 54.3 2562.7 36.2 20.1 0.6 -5125.3 72.5
## h2.s2 -5966.6 46.3 -2839.2 18.6 4.8 0.1 5678.4 37.2
## h2.s1 -5968.9 46.4 -2841.5 18.8 9.3 0.2 5682.9 37.6
Here we see that both lognormal models - linear and non-linear - seem to have the best diagnostics as measured by loo
Information criterion (WAIC)
w <- loo_compare(h2.m1, h2.m2, h2.s1, h2.s2, h2.s3, h2.s4, criterion = "waic")
print(w, simplify = F)
## elpd_diff se_diff elpd_waic se_elpd_waic p_waic se_p_waic waic
## h2.s4 0.0 0.0 3127.5 40.7 41.9 1.8 -6255.1
## h2.m2 -7.2 4.4 3120.3 40.6 34.7 1.4 -6240.6
## h2.s3 -564.6 54.3 2563.0 36.0 23.0 0.7 -5125.9
## h2.m1 -564.9 54.3 2562.7 36.2 20.0 0.6 -5125.4
## h2.s2 -5966.7 46.3 -2839.2 18.6 4.8 0.1 5678.4
## h2.s1 -5969.0 46.4 -2841.5 18.8 9.3 0.2 5682.9
## se_waic
## h2.s4 81.4
## h2.m2 81.1
## h2.s3 72.1
## h2.m1 72.5
## h2.s2 37.2
## h2.s1 37.6
# see Solomon Kurz
cbind(waic_diff = w[,1] * -2,
se = w[,2] * 2)
## waic_diff se
## h2.s4 0.00000 0.000000
## h2.m2 14.44092 8.748315
## h2.s3 1129.18036 108.542331
## h2.m1 1129.72399 108.642706
## h2.s2 11933.49758 92.630038
## h2.s1 11938.00693 92.856272
Plot the comparison
Here we see identical results
model_weights(h2.m1, h2.m2, h2.s1, h2.s2, h2.s3, h2.s4, weights = "waic") %>%
round(digits = 2)
## h2.m1 h2.m2 h2.s1 h2.s2 h2.s3 h2.s4
## 0 0 0 0 0 1
For these data, GAMs are probably adding unnecessary complexity as we are not gaining much more explanatory power. However, we leave it open whether linear or non-linear models will be better suited for the real data, following this pipeline.
Similarly to H1, we will also set mildly informative priors. Let’s check what priors have been selected as default for h2.s4
# Print priors
prior_summary(h2.s4)
## prior class coef group
## (flat) b
## (flat) b Big5
## (flat) b CommAtt2M1
## (flat) b Expressibility_z
## (flat) b Familiarity
## (flat) b Modality1
## (flat) b Modality2
## (flat) b sPrevAn_z_1
## student_t(3, 0.5, 2.5) Intercept
## student_t(3, 0, 2.5) sd
## student_t(3, 0, 2.5) sd Concept
## student_t(3, 0, 2.5) sd Intercept Concept
## student_t(3, 0, 2.5) sd PrevAn_z Concept
## student_t(3, 0, 2.5) sd Participant
## student_t(3, 0, 2.5) sd Intercept Participant
## student_t(3, 0, 2.5) sd PrevAn_z Participant
## student_t(3, 0, 2.5) sd TrialNumber_c
## student_t(3, 0, 2.5) sd Intercept TrialNumber_c
## student_t(3, 0, 2.5) sds
## student_t(3, 0, 2.5) sds s(PrevAn_z, bs = "bs", k = 19)
## student_t(3, 0, 2.5) sigma
## resp dpar nlpar lb ub source
## default
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## default
## 0 default
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 (vectorized)
## 0 default
## 0 (vectorized)
## 0 default
For the flat priors that have been selected as default, we can re-use our previous priors from H1,. The priors could therefore look somewhat like this
priors_h2s5 <- c(
set_prior("normal(0,0.50)", class = "b", coef = "sPrevAn_z_1"),
set_prior("normal(0,0.50)", class = "b", coef = "CommAtt2M1"),
set_prior("normal(0,0.25)", class = "b", coef = "Modality1"),
set_prior("normal(0,0.25)", class = "b", coef = "Modality2"),
set_prior("normal(0,0.25)", class = "b", coef = "Big5"),
set_prior("normal(0,0.25)", class = "b", coef = "Familiarity"),
set_prior("normal(0,0.25)", class = "b", coef = "Expressibility_z")
)
Because we are currently already reaching the ceiling of R^2, we are not going to add more interactions. However, we are not excluding the option of adding few interactions when using the real data. Possible interactions include:
PrevAn x Modality - similarity affects the change in effort differently (e.g., vocal might still require more effort)
PrevAn x Expressibility - similarity in relation to effort might matter only for highly expressible concepts, and low expressible concepts are difficult to express, so also difficult to exaggerate
Familiarity x PrevAn - if the guess is really bad, only very familiar people might be motivated enough to put more effort
Big x PrevAn - same like with familiarity
Similar to the workflow adopted in H1 modelling, we would fit two new models, one with priors and one with interactions, and perform another round of diagnostics to see which model seems to have the most predictive power.